2013-06-25 16:20:43

by Mika Westerberg

[permalink] [raw]
Subject: [PATCH 0/6] Thunderbolt workarounds take 2

Hi all,

This is the second iteration of the patches first posted here quite a while
ago:

https://lkml.org/lkml/2012/12/13/589

With Thunderbolt devices we might see pretty complex PCIe hierarchies that
the current ACPI hotplug code was not able to handle properly but instead
failed to scan the whole hierarchy. This series tries to address these
problems. In addition to that there's a possibility that only part of the
chain is disconnected so we don't want to leave behind stale PCI devices.

We have tested this on Acer Aspire S5 with the largest chain looking like
this:

PC +--+ eSata Hub #0 +--+ eSata Hub #1 +--+ Apple Thunderbolt display +--+ Apple ethernet dongle

There are still problems with different device drivers because of the
surprise hotplug nature of Thunderbolt, causing occasional hangs and
failures that should be fixed per-driver later on (not addressed in this
series).

Currently Thunderbolt on Apple hardware like Macbooks is not yet supported
as they use some different mechanism than ACPI events to trigger hotplug
events.

The series is based on linux-pm.git/linux-next.

Kirill A. Shutemov (4):
PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()
PCI: acpiphp: enable_device(): rescan even if no new devices on slot
PCI: introduce pci_trim_stale_devices()
PCI: acpiphp: check for new devices on enabled host

Mika Westerberg (2):
PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy
x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

arch/x86/pci/fixup.c | 51 ++++++++++++++++++++++++
drivers/pci/hotplug/acpi_pcihp.c | 53 ++++++++++++++++++++++---
drivers/pci/hotplug/acpiphp_glue.c | 79 +++++++++++++++++---------------------
drivers/pci/remove.c | 20 ++++++++++
include/linux/pci.h | 1 +
5 files changed, 156 insertions(+), 48 deletions(-)

--
1.8.3.1


2013-06-25 16:19:25

by Mika Westerberg

[permalink] [raw]
Subject: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

From: "Kirill A. Shutemov" <[email protected]>

With Thunderbolt you can chain devices: connect a new devices to plugged
one. In this case the slot is already enabled, but we still want to look
for new devices behind it.

We're going to reuse enable_device() for rescan for new devices on the
enabled slot. Let's push the check up by stack.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
---
drivers/pci/hotplug/acpiphp_glue.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 59df857..b983e29 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -688,9 +688,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
int num, max, pass;
LIST_HEAD(add_list);

- if (slot->flags & SLOT_ENABLED)
- goto err_exit;
-
list_for_each_entry(func, &slot->funcs, sibling)
acpiphp_bus_add(func);

@@ -1242,6 +1239,8 @@ int acpiphp_enable_slot(struct acpiphp_slot *slot)
goto err_exit;

if (get_slot_status(slot) == ACPI_STA_ALL) {
+ if (slot->flags & SLOT_ENABLED)
+ goto err_exit;
/* configure all functions */
retval = enable_device(slot);
if (retval)
--
1.8.3.1

2013-06-25 16:19:31

by Mika Westerberg

[permalink] [raw]
Subject: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
This means that the BIOS will allocate bridge resources based on some
assumptions of a maximum Thunderbolt chain. It also disables native PCIe
hotplug of the root port where the Thunderbolt host router is connected.

In order to support this we must make sure that the kernel does not try to
be too smart and resize / open the bridge windows during PCI enumeration.
For example by default the kernel will add certain amount of space to the
bridge memory/io windows (this is configurable via pci=hp[mem|io]size=xxx
command line option). Eventually we run out of space that the BIOS has
allocated.

Also address space for expansion ROMs should not be allocated (BIOS does
not execute them for Thunderbolt endpoints). If we don't prevent this the
kernel might find expansion ROM associated with some endpoint and reopen
the bridge window which the BIOS already closed leading again resource
exhaustion.

Fix this by adding a quirk that matches known Thunderbolt PCI-to-PCI
bridges and in that case prevents allocation of expansion ROM resources and
makes sure that the PCI core does not increase size of bridge windows.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
---
arch/x86/pci/fixup.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 51 insertions(+)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index f5809fa..924822b 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -7,6 +7,8 @@
#include <linux/pci.h>
#include <linux/init.h>
#include <linux/vgaarb.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
#include <asm/pci_x86.h>

static void pci_fixup_i450nx(struct pci_dev *d)
@@ -539,3 +541,52 @@ static void twinhead_reserve_killing_zone(struct pci_dev *dev)
}
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x27B9, twinhead_reserve_killing_zone);
+
+#ifdef CONFIG_ACPI
+/*
+ * BIOS assisted Thunderbolt PCI enumeration should handle all resource
+ * allocation on behalf of OS.
+ */
+static void quirk_thunderbolt(struct pci_dev *dev)
+{
+ struct acpi_pci_root *root;
+ acpi_handle handle;
+
+ handle = acpi_find_root_bridge_handle(dev);
+ if (!handle)
+ return;
+
+ root = acpi_pci_find_root(handle);
+ if (!root)
+ return;
+
+ /*
+ * Native PCIe hotplug should be disabled when BIOS assisted
+ * hotplug is in use.
+ */
+ if (root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
+ return;
+
+ /*
+ * Make sure that we don't allocate resources for expansion ROMs.
+ * This may accidentally increase the size of the bridge window
+ * causing us to run out of resources.
+ */
+ if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
+ pr_info("Thunderbolt host router detected disabling ROMs\n");
+ pci_probe |= PCI_NOASSIGN_ROMS;
+ }
+
+ /*
+ * Don't add anything to the BIOS allocated bridge window size for
+ * the same reason.
+ */
+ dev->is_hotplug_bridge = 0;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1513, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x151a, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x151b, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1547, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1548, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1549, quirk_thunderbolt);
+#endif
--
1.8.3.1

2013-06-25 16:19:30

by Mika Westerberg

[permalink] [raw]
Subject: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

The acpiphp driver finds out whether the device is hotpluggable by checking
whether it has _RMV method behind it (and if it returns 1). However, at
least Acer Aspire S5 with Thunderbolt host router has this method placed
behind device called EPUP (endpoint upstream port?) and not directly behind
the root port as can be seen from the ASL code below:

Device (RP05)
{
...
Device (HRUP)
{
Name (_ADR, Zero)
Name (_PRW, Package (0x02)
{
0x09,
0x04
})
Device (HRDN)
{
Name (_ADR, 0x00040000)
Name (_PRW, Package (0x02)
{
0x09,
0x04
})
Device (EPUP)
{
Name (_ADR, Zero)
Method (_RMV, 0, NotSerialized)
{
Return (One)
}
}
}
}

If we want to support such machines we must look for the _RMV method a bit
deeper in the hierarchy. Fix this by changing pcihp_is_ejectable() to check
few more devices down from the root port.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
---
drivers/pci/hotplug/acpi_pcihp.c | 53 ++++++++++++++++++++++++++++++++++++----
1 file changed, 48 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/hotplug/acpi_pcihp.c b/drivers/pci/hotplug/acpi_pcihp.c
index 2a47e82..2760954 100644
--- a/drivers/pci/hotplug/acpi_pcihp.c
+++ b/drivers/pci/hotplug/acpi_pcihp.c
@@ -408,21 +408,64 @@ got_one:
}
EXPORT_SYMBOL(acpi_get_hp_hw_control_from_firmware);

+static acpi_status pcihp_evaluate_rmv(acpi_handle handle, u32 lvl,
+ void *context, void **return_not_used)
+{
+ unsigned long long *removable = context;
+ unsigned long long value;
+ acpi_status status;
+
+ status = acpi_evaluate_integer(handle, "_RMV", NULL, &value);
+ if (ACPI_SUCCESS(status) && value) {
+ *removable = value;
+ return AE_CTRL_TERMINATE;
+ }
+ return AE_OK;
+}
+
+/*
+ * pcihp_is_removable - check if this device is removable
+ * @handle: ACPI handle of the device
+ * @depth: how deep in the hierarchy we look for the _RMV
+ *
+ * Look for _RMV method behind the device. @depth specifies how deep in the
+ * hierarchy we search.
+ *
+ * Returns %true if the device is removable, %false otherwise.
+ */
+static bool pcihp_is_removable(acpi_handle handle, size_t depth)
+{
+ unsigned long long removable = 0;
+ acpi_status status;
+
+ status = pcihp_evaluate_rmv(handle, 0, &removable, NULL);
+ if ((status == AE_CTRL_TERMINATE) && removable)
+ return true;
+
+ acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, depth, pcihp_evaluate_rmv,
+ NULL, &removable, NULL);
+ return !!removable;
+}
+
+/*
+ * Some BIOSes, like the one in Acer Aspire S5 places the _RMV method a bit
+ * deeper in the hierarhcy. So check at least 3 levels behind this device.
+ */
+#define PCIHP_RMV_MAX_DEPTH 3
+
static int pcihp_is_ejectable(acpi_handle handle)
{
acpi_status status;
acpi_handle tmp;
- unsigned long long removable;
+
status = acpi_get_handle(handle, "_ADR", &tmp);
if (ACPI_FAILURE(status))
return 0;
status = acpi_get_handle(handle, "_EJ0", &tmp);
if (ACPI_SUCCESS(status))
return 1;
- status = acpi_evaluate_integer(handle, "_RMV", NULL, &removable);
- if (ACPI_SUCCESS(status) && removable)
- return 1;
- return 0;
+
+ return pcihp_is_removable(handle, PCIHP_RMV_MAX_DEPTH);
}

/**
--
1.8.3.1

2013-06-25 16:19:24

by Mika Westerberg

[permalink] [raw]
Subject: [PATCH 3/6] PCI: introduce pci_trim_stale_devices()

From: "Kirill A. Shutemov" <[email protected]>

The new function stops and removes the device if it doesn't respond.
If the device responds and it's a bus we apply the function to all
subdevices recursively.

It's useful for hotplug bus check case, when you need drop all unplugged
devices before looking for new ones.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
---
drivers/pci/remove.c | 20 ++++++++++++++++++++
include/linux/pci.h | 1 +
2 files changed, 21 insertions(+)

diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 8fc54b7..77b7a64 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -112,6 +112,26 @@ void pci_stop_and_remove_bus_device(struct pci_dev *dev)
}
EXPORT_SYMBOL(pci_stop_and_remove_bus_device);

+/**
+ * pci_trim_stale_devices - remove stale device or any stale child
+ */
+void pci_trim_stale_devices(struct pci_dev *dev)
+{
+ struct pci_bus *bus = dev->subordinate;
+ struct pci_dev *child, *tmp;
+ bool alive;
+ u32 l;
+
+ /* check if the device responds */
+ alive = pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &l, 0);
+ if (!alive)
+ pci_stop_and_remove_bus_device(dev);
+
+ if (alive && bus)
+ list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
+ pci_trim_stale_devices(child);
+}
+
void pci_stop_root_bus(struct pci_bus *bus)
{
struct pci_dev *child, *tmp;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3a24e4f..8f6e7a0 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -754,6 +754,7 @@ struct pci_dev *pci_dev_get(struct pci_dev *dev);
void pci_dev_put(struct pci_dev *dev);
void pci_remove_bus(struct pci_bus *b);
void pci_stop_and_remove_bus_device(struct pci_dev *dev);
+void pci_trim_stale_devices(struct pci_dev *dev);
void pci_stop_root_bus(struct pci_bus *bus);
void pci_remove_root_bus(struct pci_bus *bus);
void pci_setup_cardbus(struct pci_bus *bus);
--
1.8.3.1

2013-06-25 16:20:59

by Mika Westerberg

[permalink] [raw]
Subject: [PATCH 4/6] PCI: acpiphp: check for new devices on enabled host

From: "Kirill A. Shutemov" <[email protected]>

Current acpiphp_check_bridge() implementation is pretty dumb:
- it enables the slot if it's not enabled and the slot status is
ACPI_STA_ALL;
- it disables the slot if it's enabled and slot is not in ACPI_STA_ALL
state.

This behavior is not enough to handle Thunderbolt chaining case
properly. We need to actually rescan for new devices even if a device
has already in the slot.

The new implementation disables and stops the slot if it's not in
ACPI_STA_ALL state.

For ACPI_STA_ALL state we first trim devices which don't respond and
look for the ones after that. We do that even if slot already enabled
(SLOT_ENABLED).

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
---
drivers/pci/hotplug/acpiphp_glue.c | 56 ++++++++++++++++++--------------------
1 file changed, 27 insertions(+), 29 deletions(-)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 80a6ea1..82a4ec9 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -868,43 +868,41 @@ int acpiphp_eject_slot(struct acpiphp_slot *slot)
* Iterate over all slots under this bridge and make sure that if a
* card is present they are enabled, and if not they are disabled.
*/
-static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
+static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
{
struct acpiphp_slot *slot;
- int retval = 0;
- int enabled, disabled;
-
- enabled = disabled = 0;

list_for_each_entry(slot, &bridge->slots, node) {
- unsigned int status = get_slot_status(slot);
- if (slot->flags & SLOT_ENABLED) {
- if (status == ACPI_STA_ALL)
- continue;
- retval = acpiphp_disable_slot(slot);
- if (retval) {
- err("Error occurred in disabling\n");
- goto err_exit;
- } else {
- acpiphp_eject_slot(slot);
+ struct pci_bus *bus = slot->bridge->pci_bus;
+ struct pci_dev *dev, *tmp;
+ int retval;
+
+ mutex_lock(&slot->crit_sect);
+ /* wake up all functions */
+ retval = power_on_slot(slot);
+ if (retval)
+ goto unlock;
+
+ if (get_slot_status(slot) == ACPI_STA_ALL) {
+ /* remove stale devices if any */
+ list_for_each_entry_safe(dev, tmp,
+ &bus->devices, bus_list) {
+ if (PCI_SLOT(dev->devfn) != slot->device)
+ continue;
+ pci_trim_stale_devices(dev);
}
- disabled++;
+
+ /* configure all functions */
+ retval = enable_device(slot);
+ if (retval)
+ power_off_slot(slot);
} else {
- if (status != ACPI_STA_ALL)
- continue;
- retval = acpiphp_enable_slot(slot);
- if (retval) {
- err("Error occurred in enabling\n");
- goto err_exit;
- }
- enabled++;
+ disable_device(slot);
+ power_off_slot(slot);
}
+unlock:
+ mutex_unlock(&slot->crit_sect);
}
-
- dbg("%s: %d enabled, %d disabled\n", __func__, enabled, disabled);
-
- err_exit:
- return retval;
}

static void acpiphp_set_hpp_values(struct pci_bus *bus)
--
1.8.3.1

2013-06-25 16:21:18

by Mika Westerberg

[permalink] [raw]
Subject: [PATCH 2/6] PCI: acpiphp: enable_device(): rescan even if no new devices on slot

From: "Kirill A. Shutemov" <[email protected]>

pci_scan_slot() returns number of new devices connected *directly*
connected to the slot. Current enable_device() checks the return value
and stops if it doesn't see a new device.

In Thunderbolt chaining case the new device can be deeper in hierarchy, so
do the rescan anyway.

Because of that we must make sure that pcibios_resource_survey_bus() and
check_hotplug_bridge() get called only for a just found bus and not the
ones already added to the system. Failure to do so will lead to resource
conflicts.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
---
drivers/pci/hotplug/acpiphp_glue.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index b983e29..80a6ea1 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -685,18 +685,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
struct pci_dev *dev;
struct pci_bus *bus = slot->bridge->pci_bus;
struct acpiphp_func *func;
- int num, max, pass;
+ int max, pass;
LIST_HEAD(add_list);

list_for_each_entry(func, &slot->funcs, sibling)
acpiphp_bus_add(func);

- num = pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
- if (num == 0) {
- /* Maybe only part of funcs are added. */
- dbg("No new device found\n");
- goto err_exit;
- }
+ pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));

max = acpiphp_max_busnr(bus);
for (pass = 0; pass < 2; pass++) {
@@ -707,8 +702,11 @@ static int __ref enable_device(struct acpiphp_slot *slot)
dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
max = pci_scan_bridge(bus, dev, max, pass);
if (pass && dev->subordinate) {
- check_hotplug_bridge(slot, dev);
- pcibios_resource_survey_bus(dev->subordinate);
+ if (!dev->subordinate->is_added) {
+ check_hotplug_bridge(slot, dev);
+ pcibios_resource_survey_bus(
+ dev->subordinate);
+ }
__pci_bus_size_bridges(dev->subordinate,
&add_list);
}
@@ -742,8 +740,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
}
}

-
- err_exit:
return 0;
}

--
1.8.3.1

2013-06-25 17:47:47

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 3/6] PCI: introduce pci_trim_stale_devices()

On Tue, Jun 25, 2013 at 7:22 PM, Mika Westerberg
<[email protected]> wrote:

> The new function stops and removes the device if it doesn't respond.
> If the device responds and it's a bus we apply the function to all
> subdevices recursively.
>
> It's useful for hotplug bus check case, when you need drop all unplugged
> devices before looking for new ones.

One comment below.

> +void pci_trim_stale_devices(struct pci_dev *dev)
> +{
> + struct pci_bus *bus = dev->subordinate;
> + struct pci_dev *child, *tmp;
> + bool alive;
> + u32 l;
> +
> + /* check if the device responds */
> + alive = pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &l, 0);
> + if (!alive)
> + pci_stop_and_remove_bus_device(dev);
> +
> + if (alive && bus)
> + list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
> + pci_trim_stale_devices(child);

It's not a tail call anyway, so, what about

if (!pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &l, 0)) {
pci_stop_and_remove_bus_device(dev);
} else if (bus) {
list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
pci_trim_stale_devices(child);
}

> +}

--
With Best Regards,
Andy Shevchenko

2013-06-25 17:53:40

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 3/6] PCI: introduce pci_trim_stale_devices()

Andy Shevchenko wrote:
> On Tue, Jun 25, 2013 at 7:22 PM, Mika Westerberg
> <[email protected]> wrote:
>
> > The new function stops and removes the device if it doesn't respond.
> > If the device responds and it's a bus we apply the function to all
> > subdevices recursively.
> >
> > It's useful for hotplug bus check case, when you need drop all unplugged
> > devices before looking for new ones.
>
> One comment below.
>
> > +void pci_trim_stale_devices(struct pci_dev *dev)
> > +{
> > + struct pci_bus *bus = dev->subordinate;
> > + struct pci_dev *child, *tmp;
> > + bool alive;
> > + u32 l;
> > +
> > + /* check if the device responds */
> > + alive = pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &l, 0);
> > + if (!alive)
> > + pci_stop_and_remove_bus_device(dev);
> > +
> > + if (alive && bus)
> > + list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
> > + pci_trim_stale_devices(child);
>
> It's not a tail call anyway, so, what about
>
> if (!pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &l, 0)) {
> pci_stop_and_remove_bus_device(dev);
> } else if (bus) {
> list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
> pci_trim_stale_devices(child);
> }

Why? It looks uglier.


--
Kirill A. Shutemov

2013-06-25 18:04:34

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 4/6] PCI: acpiphp: check for new devices on enabled host

On Tue, Jun 25, 2013 at 7:22 PM, Mika Westerberg
<[email protected]> wrote:

> Current acpiphp_check_bridge() implementation is pretty dumb:
> - it enables the slot if it's not enabled and the slot status is
> ACPI_STA_ALL;
> - it disables the slot if it's enabled and slot is not in ACPI_STA_ALL
> state.
>
> This behavior is not enough to handle Thunderbolt chaining case
> properly. We need to actually rescan for new devices even if a device
> has already in the slot.
>
> The new implementation disables and stops the slot if it's not in
> ACPI_STA_ALL state.
>
> For ACPI_STA_ALL state we first trim devices which don't respond and
> look for the ones after that. We do that even if slot already enabled
> (SLOT_ENABLED).

Just a couple of nitpicks below.

> list_for_each_entry(slot, &bridge->slots, node) {
> + struct pci_bus *bus = slot->bridge->pci_bus;
> + struct pci_dev *dev, *tmp;
> + int retval;
> +
> + mutex_lock(&slot->crit_sect);

Does it make sense to introduce a helper let's say
__acpiphp_check_slot() and put there all lines inside this mutex?

> + if (get_slot_status(slot) == ACPI_STA_ALL) {
> + /* remove stale devices if any */
> + list_for_each_entry_safe(dev, tmp,
> + &bus->devices, bus_list) {
> + if (PCI_SLOT(dev->devfn) != slot->device)
> + continue;
> + pci_trim_stale_devices(dev);

Perhaps

list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) {
if (PCI_SLOT(dev->devfn) == slot->device)
pci_trim_stale_devices(dev);
}

--
With Best Regards,
Andy Shevchenko

2013-06-25 18:15:49

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

On Tue, Jun 25, 2013 at 7:22 PM, Mika Westerberg
<[email protected]> wrote:

> The acpiphp driver finds out whether the device is hotpluggable by checking
> whether it has _RMV method behind it (and if it returns 1). However, at
> least Acer Aspire S5 with Thunderbolt host router has this method placed
> behind device called EPUP (endpoint upstream port?) and not directly behind
> the root port as can be seen from the ASL code below:

[]


> +static acpi_status pcihp_evaluate_rmv(acpi_handle handle, u32 lvl,
> + void *context, void **return_not_used)
> +{
> + unsigned long long *removable = context;
> + unsigned long long value;
> + acpi_status status;
> +
> + status = acpi_evaluate_integer(handle, "_RMV", NULL, &value);
> + if (ACPI_SUCCESS(status) && value) {

So, there is a chance the caller gets back uninitialized *context.
Let's assume that is by design.

> + *removable = value;
> + return AE_CTRL_TERMINATE;
> + }
> + return AE_OK;
> +}


> +static bool pcihp_is_removable(acpi_handle handle, size_t depth)
> +{
> + unsigned long long removable = 0;
> + acpi_status status;
> +
> + status = pcihp_evaluate_rmv(handle, 0, &removable, NULL);
> + if ((status == AE_CTRL_TERMINATE) && removable)

Here you already have removable not equal zero.

Internal braces could be removed, but it's up top you.

> + return true;
> +
> + acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, depth, pcihp_evaluate_rmv,
> + NULL, &removable, NULL);
> + return !!removable;
> +}


--
With Best Regards,
Andy Shevchenko

2013-06-25 18:27:47

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

On Tue, Jun 25, 2013 at 09:15:47PM +0300, Andy Shevchenko wrote:
> On Tue, Jun 25, 2013 at 7:22 PM, Mika Westerberg
> <[email protected]> wrote:
>
> > The acpiphp driver finds out whether the device is hotpluggable by checking
> > whether it has _RMV method behind it (and if it returns 1). However, at
> > least Acer Aspire S5 with Thunderbolt host router has this method placed
> > behind device called EPUP (endpoint upstream port?) and not directly behind
> > the root port as can be seen from the ASL code below:
>
> []
>
>
> > +static acpi_status pcihp_evaluate_rmv(acpi_handle handle, u32 lvl,
> > + void *context, void **return_not_used)
> > +{
> > + unsigned long long *removable = context;
> > + unsigned long long value;
> > + acpi_status status;
> > +
> > + status = acpi_evaluate_integer(handle, "_RMV", NULL, &value);
> > + if (ACPI_SUCCESS(status) && value) {
>
> So, there is a chance the caller gets back uninitialized *context.
> Let's assume that is by design.
>
> > + *removable = value;
> > + return AE_CTRL_TERMINATE;
> > + }
> > + return AE_OK;
> > +}
>
>
> > +static bool pcihp_is_removable(acpi_handle handle, size_t depth)
> > +{
> > + unsigned long long removable = 0;
> > + acpi_status status;
> > +
> > + status = pcihp_evaluate_rmv(handle, 0, &removable, NULL);
> > + if ((status == AE_CTRL_TERMINATE) && removable)
>
> Here you already have removable not equal zero.

Hmm, removable is initialized to zero just few lines above... Did I miss
something obvious?

> Internal braces could be removed, but it's up top you.

I'll remove them in the next revision.

Thanks.

2013-06-25 18:31:54

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

On Tue, Jun 25, 2013 at 9:31 PM, Mika Westerberg
<[email protected]> wrote:
> On Tue, Jun 25, 2013 at 09:15:47PM +0300, Andy Shevchenko wrote:
>> On Tue, Jun 25, 2013 at 7:22 PM, Mika Westerberg
>> <[email protected]> wrote:

[]

>> > +static acpi_status pcihp_evaluate_rmv(acpi_handle handle, u32 lvl,
>> > + void *context, void **return_not_used)
>> > +{
>> > + unsigned long long *removable = context;
>> > + unsigned long long value;
>> > + acpi_status status;
>> > +
>> > + status = acpi_evaluate_integer(handle, "_RMV", NULL, &value);
>> > + if (ACPI_SUCCESS(status) && value) {
>>
>> So, there is a chance the caller gets back uninitialized *context.
>> Let's assume that is by design.
>>
>> > + *removable = value;
>> > + return AE_CTRL_TERMINATE;
>> > + }
>> > + return AE_OK;
>> > +}
>>
>>
>> > +static bool pcihp_is_removable(acpi_handle handle, size_t depth)
>> > +{
>> > + unsigned long long removable = 0;
>> > + acpi_status status;
>> > +
>> > + status = pcihp_evaluate_rmv(handle, 0, &removable, NULL);
>> > + if ((status == AE_CTRL_TERMINATE) && removable)
>>
>> Here you already have removable not equal zero.
>
> Hmm, removable is initialized to zero just few lines above... Did I miss
> something obvious?

Yes, that's correct, however, you already did this check when you call
evaluate_rmv. Thus, second check '&& removable' is not needed.

--
With Best Regards,
Andy Shevchenko

2013-06-25 18:47:21

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

On Tue, Jun 25, 2013 at 09:31:52PM +0300, Andy Shevchenko wrote:
> On Tue, Jun 25, 2013 at 9:31 PM, Mika Westerberg
> <[email protected]> wrote:
> > On Tue, Jun 25, 2013 at 09:15:47PM +0300, Andy Shevchenko wrote:
> >> On Tue, Jun 25, 2013 at 7:22 PM, Mika Westerberg
> >> <[email protected]> wrote:
>
> []
>
> >> > +static acpi_status pcihp_evaluate_rmv(acpi_handle handle, u32 lvl,
> >> > + void *context, void **return_not_used)
> >> > +{
> >> > + unsigned long long *removable = context;
> >> > + unsigned long long value;
> >> > + acpi_status status;
> >> > +
> >> > + status = acpi_evaluate_integer(handle, "_RMV", NULL, &value);
> >> > + if (ACPI_SUCCESS(status) && value) {
> >>
> >> So, there is a chance the caller gets back uninitialized *context.
> >> Let's assume that is by design.
> >>
> >> > + *removable = value;
> >> > + return AE_CTRL_TERMINATE;
> >> > + }
> >> > + return AE_OK;
> >> > +}
> >>
> >>
> >> > +static bool pcihp_is_removable(acpi_handle handle, size_t depth)
> >> > +{
> >> > + unsigned long long removable = 0;
> >> > + acpi_status status;
> >> > +
> >> > + status = pcihp_evaluate_rmv(handle, 0, &removable, NULL);
> >> > + if ((status == AE_CTRL_TERMINATE) && removable)
> >>
> >> Here you already have removable not equal zero.
> >
> > Hmm, removable is initialized to zero just few lines above... Did I miss
> > something obvious?
>
> Yes, that's correct, however, you already did this check when you call
> evaluate_rmv. Thus, second check '&& removable' is not needed.

Ah, right. Got it now :-)

I think it is better to remove the first value check from the
pcihp_evaluate_rmv() and keep the check here.

I'll fix that in the next revision as well.

2013-06-25 19:30:11

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

On Tue, Jun 25, 2013 at 9:51 PM, Mika Westerberg
<[email protected]> wrote:
> On Tue, Jun 25, 2013 at 09:31:52PM +0300, Andy Shevchenko wrote:

> I think it is better to remove the first value check from the
> pcihp_evaluate_rmv() and keep the check here.
>
> I'll fix that in the next revision as well.

In that case you may remove the removable assignment as well.

--
With Best Regards,
Andy Shevchenko

2013-06-25 21:13:57

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Tue, 25 Jun 2013 19:22:10 +0300
Mika Westerberg <[email protected]> wrote:

> + if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> + pr_info("Thunderbolt host router detected disabling ROMs\n");
> + pci_probe |= PCI_NOASSIGN_ROMS;
> + }

I wonder if this should just be the default on x86? Or do we allocate
ROM space to address some other platform where we need it and the BIOS
doesn't do it for the devices we care about?

--
Jesse Barnes, Intel Open Source Technology Center

2013-06-25 23:14:58

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 0/6] Thunderbolt workarounds take 2

On Tuesday, June 25, 2013 07:22:04 PM Mika Westerberg wrote:
> Hi all,
>
> This is the second iteration of the patches first posted here quite a while
> ago:
>
> https://lkml.org/lkml/2012/12/13/589
>
> With Thunderbolt devices we might see pretty complex PCIe hierarchies that
> the current ACPI hotplug code was not able to handle properly but instead
> failed to scan the whole hierarchy. This series tries to address these
> problems. In addition to that there's a possibility that only part of the
> chain is disconnected so we don't want to leave behind stale PCI devices.
>
> We have tested this on Acer Aspire S5 with the largest chain looking like
> this:
>
> PC +--+ eSata Hub #0 +--+ eSata Hub #1 +--+ Apple Thunderbolt display +--+ Apple ethernet dongle
>
> There are still problems with different device drivers because of the
> surprise hotplug nature of Thunderbolt, causing occasional hangs and
> failures that should be fixed per-driver later on (not addressed in this
> series).
>
> Currently Thunderbolt on Apple hardware like Macbooks is not yet supported
> as they use some different mechanism than ACPI events to trigger hotplug
> events.
>
> The series is based on linux-pm.git/linux-next.
>
> Kirill A. Shutemov (4):
> PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()
> PCI: acpiphp: enable_device(): rescan even if no new devices on slot
> PCI: introduce pci_trim_stale_devices()
> PCI: acpiphp: check for new devices on enabled host
>
> Mika Westerberg (2):
> PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy
> x86/PCI: quirk Thunderbolt PCI-to-PCI bridges
>
> arch/x86/pci/fixup.c | 51 ++++++++++++++++++++++++
> drivers/pci/hotplug/acpi_pcihp.c | 53 ++++++++++++++++++++++---
> drivers/pci/hotplug/acpiphp_glue.c | 79 +++++++++++++++++---------------------
> drivers/pci/remove.c | 20 ++++++++++
> include/linux/pci.h | 1 +
> 5 files changed, 156 insertions(+), 48 deletions(-)

Would you please CC things touching ACPIPHP to linux-acpi?

Rafael

2013-06-26 07:21:55

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 0/6] Thunderbolt workarounds take 2

On Wed, Jun 26, 2013 at 01:24:25AM +0200, Rafael J. Wysocki wrote:
> Would you please CC things touching ACPIPHP to linux-acpi?

Sorry about that -- It wasn't listed in the output of get_maintainers.pl so
it kind of was forgotten.

Do you want me to resend this series with linux-acpi@ included?

2013-06-26 09:37:14

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 4/6] PCI: acpiphp: check for new devices on enabled host

Andy Shevchenko wrote:
> On Tue, Jun 25, 2013 at 7:22 PM, Mika Westerberg
> <[email protected]> wrote:
>
> > Current acpiphp_check_bridge() implementation is pretty dumb:
> > - it enables the slot if it's not enabled and the slot status is
> > ACPI_STA_ALL;
> > - it disables the slot if it's enabled and slot is not in ACPI_STA_ALL
> > state.
> >
> > This behavior is not enough to handle Thunderbolt chaining case
> > properly. We need to actually rescan for new devices even if a device
> > has already in the slot.
> >
> > The new implementation disables and stops the slot if it's not in
> > ACPI_STA_ALL state.
> >
> > For ACPI_STA_ALL state we first trim devices which don't respond and
> > look for the ones after that. We do that even if slot already enabled
> > (SLOT_ENABLED).
>
> Just a couple of nitpicks below.
>
> > list_for_each_entry(slot, &bridge->slots, node) {
> > + struct pci_bus *bus = slot->bridge->pci_bus;
> > + struct pci_dev *dev, *tmp;
> > + int retval;
> > +
> > + mutex_lock(&slot->crit_sect);
>
> Does it make sense to introduce a helper let's say
> __acpiphp_check_slot() and put there all lines inside this mutex?

No. Why?

> > + if (get_slot_status(slot) == ACPI_STA_ALL) {
> > + /* remove stale devices if any */
> > + list_for_each_entry_safe(dev, tmp,
> > + &bus->devices, bus_list) {
> > + if (PCI_SLOT(dev->devfn) != slot->device)
> > + continue;
> > + pci_trim_stale_devices(dev);
>
> Perhaps
>
> list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) {
> if (PCI_SLOT(dev->devfn) == slot->device)
> pci_trim_stale_devices(dev);
> }

Makes sense, thanks.

--
Kirill A. Shutemov

2013-06-26 12:14:17

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
> On Tue, 25 Jun 2013 19:22:10 +0300
> Mika Westerberg <[email protected]> wrote:
>
> > + if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> > + pr_info("Thunderbolt host router detected disabling ROMs\n");
> > + pci_probe |= PCI_NOASSIGN_ROMS;
> > + }
>
> I wonder if this should just be the default on x86? Or do we allocate
> ROM space to address some other platform where we need it and the BIOS
> doesn't do it for the devices we care about?

Good question. In our case it definitely helps to have pci=norom the
default. Can't tell if it might break something that depends on the current
behaviour.

Bjorn, Greg, Rafael,

What do you think?

2013-06-26 12:36:14

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 0/6] Thunderbolt workarounds take 2

On Wednesday, June 26, 2013 10:25:36 AM Mika Westerberg wrote:
> On Wed, Jun 26, 2013 at 01:24:25AM +0200, Rafael J. Wysocki wrote:
> > Would you please CC things touching ACPIPHP to linux-acpi?
>
> Sorry about that -- It wasn't listed in the output of get_maintainers.pl so
> it kind of was forgotten.
>
> Do you want me to resend this series with linux-acpi@ included?

No need, just please remember about that in the future.

Thanks,
Rafael

2013-06-26 15:03:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 03:17:57PM +0300, Mika Westerberg wrote:
> On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
> > On Tue, 25 Jun 2013 19:22:10 +0300
> > Mika Westerberg <[email protected]> wrote:
> >
> > > + if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> > > + pr_info("Thunderbolt host router detected disabling ROMs\n");
> > > + pci_probe |= PCI_NOASSIGN_ROMS;
> > > + }
> >
> > I wonder if this should just be the default on x86? Or do we allocate
> > ROM space to address some other platform where we need it and the BIOS
> > doesn't do it for the devices we care about?
>
> Good question. In our case it definitely helps to have pci=norom the
> default. Can't tell if it might break something that depends on the current
> behaviour.
>
> Bjorn, Greg, Rafael,
>
> What do you think?

I can't recall any specific reason to not do this, so no objection from
me, but make it a nice and small patch that can easily be reverted if
problems show up in the wild :)

thanks,

greg k-h

2013-06-26 19:48:45

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/6] Thunderbolt workarounds take 2

On Wed, Jun 26, 2013 at 02:45:40PM +0200, Rafael J. Wysocki wrote:
> On Wednesday, June 26, 2013 10:25:36 AM Mika Westerberg wrote:
> > On Wed, Jun 26, 2013 at 01:24:25AM +0200, Rafael J. Wysocki wrote:
> > > Would you please CC things touching ACPIPHP to linux-acpi?
> >
> > Sorry about that -- It wasn't listed in the output of get_maintainers.pl so
> > it kind of was forgotten.
> >
> > Do you want me to resend this series with linux-acpi@ included?
>
> No need, just please remember about that in the future.

Maybe we should do the following? That covers the following files under
drivers/pci:

drivers/pci/hotplug/pciehp_acpi.c
drivers/pci/hotplug/acpi_pcihp.c
drivers/pci/hotplug/acpiphp_glue.c
drivers/pci/hotplug/acpiphp_core.c
drivers/pci/hotplug/acpiphp_ibm.c
drivers/pci/pcie/aer/aerdrv_acpi.c
drivers/pci/pcie/portdrv_acpi.c
drivers/pci/pci-acpi.c


commit 961e3992849e18f9ed6cbcac5c54fb29925341b3
Author: Bjorn Helgaas <[email protected]>
Date: Wed Jun 26 13:38:37 2013 -0600

MAINTAINERS: Add ACPI folks for ACPI-related things under drivers/pci

Add file patterns so get_maintainers.pl reports both PCI and ACPI folks
for ACPI-related things in drivers/pci.

Signed-off-by: Bjorn Helgaas <[email protected]>

diff --git a/MAINTAINERS b/MAINTAINERS
index 3d7782b..ce65550 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -242,6 +242,9 @@ F: drivers/acpi/
F: drivers/pnp/pnpacpi/
F: include/linux/acpi.h
F: include/acpi/
+F: drivers/pci/*acpi*
+F: drivers/pci/*/*acpi*
+F: drivers/pci/*/*/*acpi*

ACPI FAN DRIVER
M: Zhang Rui <[email protected]>

2013-06-26 19:52:23

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 0/6] Thunderbolt workarounds take 2

On Wednesday, June 26, 2013 01:48:39 PM Bjorn Helgaas wrote:
> On Wed, Jun 26, 2013 at 02:45:40PM +0200, Rafael J. Wysocki wrote:
> > On Wednesday, June 26, 2013 10:25:36 AM Mika Westerberg wrote:
> > > On Wed, Jun 26, 2013 at 01:24:25AM +0200, Rafael J. Wysocki wrote:
> > > > Would you please CC things touching ACPIPHP to linux-acpi?
> > >
> > > Sorry about that -- It wasn't listed in the output of get_maintainers.pl so
> > > it kind of was forgotten.
> > >
> > > Do you want me to resend this series with linux-acpi@ included?
> >
> > No need, just please remember about that in the future.
>
> Maybe we should do the following? That covers the following files under
> drivers/pci:
>
> drivers/pci/hotplug/pciehp_acpi.c
> drivers/pci/hotplug/acpi_pcihp.c
> drivers/pci/hotplug/acpiphp_glue.c
> drivers/pci/hotplug/acpiphp_core.c
> drivers/pci/hotplug/acpiphp_ibm.c
> drivers/pci/pcie/aer/aerdrv_acpi.c
> drivers/pci/pcie/portdrv_acpi.c
> drivers/pci/pci-acpi.c
>
>
> commit 961e3992849e18f9ed6cbcac5c54fb29925341b3
> Author: Bjorn Helgaas <[email protected]>
> Date: Wed Jun 26 13:38:37 2013 -0600
>
> MAINTAINERS: Add ACPI folks for ACPI-related things under drivers/pci
>
> Add file patterns so get_maintainers.pl reports both PCI and ACPI folks
> for ACPI-related things in drivers/pci.
>
> Signed-off-by: Bjorn Helgaas <[email protected]>

Acked-by: Rafael J. Wysocki <[email protected]>

> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3d7782b..ce65550 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -242,6 +242,9 @@ F: drivers/acpi/
> F: drivers/pnp/pnpacpi/
> F: include/linux/acpi.h
> F: include/acpi/
> +F: drivers/pci/*acpi*
> +F: drivers/pci/*/*acpi*
> +F: drivers/pci/*/*/*acpi*
>
> ACPI FAN DRIVER
> M: Zhang Rui <[email protected]>
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-06-26 19:55:52

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/6] Thunderbolt workarounds take 2

On Wed, Jun 26, 2013 at 2:01 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Wednesday, June 26, 2013 01:48:39 PM Bjorn Helgaas wrote:
>> On Wed, Jun 26, 2013 at 02:45:40PM +0200, Rafael J. Wysocki wrote:
>> > On Wednesday, June 26, 2013 10:25:36 AM Mika Westerberg wrote:
>> > > On Wed, Jun 26, 2013 at 01:24:25AM +0200, Rafael J. Wysocki wrote:
>> > > > Would you please CC things touching ACPIPHP to linux-acpi?
>> > >
>> > > Sorry about that -- It wasn't listed in the output of get_maintainers.pl so
>> > > it kind of was forgotten.
>> > >
>> > > Do you want me to resend this series with linux-acpi@ included?
>> >
>> > No need, just please remember about that in the future.
>>
>> Maybe we should do the following? That covers the following files under
>> drivers/pci:
>>
>> drivers/pci/hotplug/pciehp_acpi.c
>> drivers/pci/hotplug/acpi_pcihp.c
>> drivers/pci/hotplug/acpiphp_glue.c
>> drivers/pci/hotplug/acpiphp_core.c
>> drivers/pci/hotplug/acpiphp_ibm.c
>> drivers/pci/pcie/aer/aerdrv_acpi.c
>> drivers/pci/pcie/portdrv_acpi.c
>> drivers/pci/pci-acpi.c
>>
>>
>> commit 961e3992849e18f9ed6cbcac5c54fb29925341b3
>> Author: Bjorn Helgaas <[email protected]>
>> Date: Wed Jun 26 13:38:37 2013 -0600
>>
>> MAINTAINERS: Add ACPI folks for ACPI-related things under drivers/pci
>>
>> Add file patterns so get_maintainers.pl reports both PCI and ACPI folks
>> for ACPI-related things in drivers/pci.
>>
>> Signed-off-by: Bjorn Helgaas <[email protected]>
>
> Acked-by: Rafael J. Wysocki <[email protected]>

Added to my pci/misc branch for v3.11.

>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 3d7782b..ce65550 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -242,6 +242,9 @@ F: drivers/acpi/
>> F: drivers/pnp/pnpacpi/
>> F: include/linux/acpi.h
>> F: include/acpi/
>> +F: drivers/pci/*acpi*
>> +F: drivers/pci/*/*acpi*
>> +F: drivers/pci/*/*/*acpi*
>>
>> ACPI FAN DRIVER
>> M: Zhang Rui <[email protected]>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

2013-06-26 21:00:23

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

[+cc Alex]

On Wed, Jun 26, 2013 at 6:17 AM, Mika Westerberg
<[email protected]> wrote:
> On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
>> On Tue, 25 Jun 2013 19:22:10 +0300
>> Mika Westerberg <[email protected]> wrote:
>>
>> > + if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
>> > + pr_info("Thunderbolt host router detected disabling ROMs\n");
>> > + pci_probe |= PCI_NOASSIGN_ROMS;
>> > + }
>>
>> I wonder if this should just be the default on x86? Or do we allocate
>> ROM space to address some other platform where we need it and the BIOS
>> doesn't do it for the devices we care about?
>
> Good question. In our case it definitely helps to have pci=norom the
> default. Can't tell if it might break something that depends on the current
> behaviour.

I think the current default behavior is that if the BIOS has assigned
the ROM BAR, we keep that assignment, and if it hasn't, we allocate
MEM space for it. And "pci=norom" means that we don't allocate MEM
space for it, even if the BIOS hasn't assigned it.

"pci=norom" is only implemented on x86. I think most other arches
allocate MEM space for ROMs, with no way to turn that off. PA-RISC
seems to ignore ROMs (dino_fixup_bus()), but that looks like the
exception.

I'm slightly concerned that if we make the x86 default be "never
assign space for ROMs unless the BIOS has done it," we might break
virtualized guests that need access to ROMs. Alex?

Bjorn

2013-06-26 22:15:17

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, 2013-06-26 at 14:59 -0600, Bjorn Helgaas wrote:
> [+cc Alex]
>
> On Wed, Jun 26, 2013 at 6:17 AM, Mika Westerberg
> <[email protected]> wrote:
> > On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
> >> On Tue, 25 Jun 2013 19:22:10 +0300
> >> Mika Westerberg <[email protected]> wrote:
> >>
> >> > + if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> >> > + pr_info("Thunderbolt host router detected disabling ROMs\n");
> >> > + pci_probe |= PCI_NOASSIGN_ROMS;
> >> > + }
> >>
> >> I wonder if this should just be the default on x86? Or do we allocate
> >> ROM space to address some other platform where we need it and the BIOS
> >> doesn't do it for the devices we care about?
> >
> > Good question. In our case it definitely helps to have pci=norom the
> > default. Can't tell if it might break something that depends on the current
> > behaviour.
>
> I think the current default behavior is that if the BIOS has assigned
> the ROM BAR, we keep that assignment, and if it hasn't, we allocate
> MEM space for it. And "pci=norom" means that we don't allocate MEM
> space for it, even if the BIOS hasn't assigned it.
>
> "pci=norom" is only implemented on x86. I think most other arches
> allocate MEM space for ROMs, with no way to turn that off. PA-RISC
> seems to ignore ROMs (dino_fixup_bus()), but that looks like the
> exception.
>
> I'm slightly concerned that if we make the x86 default be "never
> assign space for ROMs unless the BIOS has done it," we might break
> virtualized guests that need access to ROMs. Alex?

Yep, I'm more than slightly concerned by that too. Whenever possible we
want to pass the ROM to the guest since it may end up being a boot
device or drivers within the guest may require it. We can pass the ROM
to the guest from an image file, but that requires someplace from which
to dump the image, which is usually PCI sysfs. Thanks,

Alex

2013-06-26 22:19:16

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
<[email protected]> wrote:
> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
> This means that the BIOS will allocate bridge resources based on some
> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
> hotplug of the root port where the Thunderbolt host router is connected.

The BIOS often assigns PCI bridge windows, e.g., for root ports
leading to ExpressCard slots. I assume BIOSes make similar
assumptions there about what ExpressCards are likely to be plugged in.
Is your BIOS assistance the same sort of thing, or is there something
additional happening at hot-add time? (I think there might be AML
that does Thunderbolt-specific stuff at hotplug-time, but I assume
that's not related to resource assignment, because the OS owns those
resources.)

> In order to support this we must make sure that the kernel does not try to
> be too smart and resize / open the bridge windows during PCI enumeration.
> For example by default the kernel will add certain amount of space to the
> bridge memory/io windows (this is configurable via pci=hp[mem|io]size=xxx
> command line option). Eventually we run out of space that the BIOS has
> allocated.

ROMs usually aren't very big compared to regular memory BARs. Is the
problem just that adding space the BIOS didn't plan for causes the OS
to increase the window size and bump into space assigned to a peer of
the Thunderbolt controller?

> Also address space for expansion ROMs should not be allocated (BIOS does
> not execute them for Thunderbolt endpoints). If we don't prevent this the
> kernel might find expansion ROM associated with some endpoint and reopen
> the bridge window which the BIOS already closed leading again resource
> exhaustion.

I assume the only reason we should not allocate expansion ROM space is
to avoid resource exhaustion. If we had enough resources, allocating
ROM space wouldn't cause anything bad to happen, would it?

> Fix this by adding a quirk that matches known Thunderbolt PCI-to-PCI
> bridges and in that case prevents allocation of expansion ROM resources and
> makes sure that the PCI core does not increase size of bridge windows.
>
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> Signed-off-by: Mika Westerberg <[email protected]>
> ---
> arch/x86/pci/fixup.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 51 insertions(+)
>
> diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> index f5809fa..924822b 100644
> --- a/arch/x86/pci/fixup.c
> +++ b/arch/x86/pci/fixup.c
> @@ -7,6 +7,8 @@
> #include <linux/pci.h>
> #include <linux/init.h>
> #include <linux/vgaarb.h>
> +#include <linux/acpi.h>
> +#include <linux/pci-acpi.h>
> #include <asm/pci_x86.h>
>
> static void pci_fixup_i450nx(struct pci_dev *d)
> @@ -539,3 +541,52 @@ static void twinhead_reserve_killing_zone(struct pci_dev *dev)
> }
> }
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x27B9, twinhead_reserve_killing_zone);
> +
> +#ifdef CONFIG_ACPI
> +/*
> + * BIOS assisted Thunderbolt PCI enumeration should handle all resource
> + * allocation on behalf of OS.
> + */
> +static void quirk_thunderbolt(struct pci_dev *dev)
> +{
> + struct acpi_pci_root *root;
> + acpi_handle handle;
> +
> + handle = acpi_find_root_bridge_handle(dev);
> + if (!handle)
> + return;
> +
> + root = acpi_pci_find_root(handle);
> + if (!root)
> + return;
> +
> + /*
> + * Native PCIe hotplug should be disabled when BIOS assisted
> + * hotplug is in use.
> + */
> + if (root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
> + return;

I'm not really sure why this test is here. I dimly recall hearing
that Thunderbolt hotplug requires some work at hotplug-time, and this
is not all public, and hence is done by AML. But that in itself
doesn't seem related to the question of allocating ROM space.

> + /*
> + * Make sure that we don't allocate resources for expansion ROMs.
> + * This may accidentally increase the size of the bridge window
> + * causing us to run out of resources.
> + */
> + if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> + pr_info("Thunderbolt host router detected disabling ROMs\n");

We have a pci_dev, so this should use dev_info().

> + pci_probe |= PCI_NOASSIGN_ROMS;

I think you really only care about the space for ROMs on devices
connected via Thunderbolt, so it's kind of a shame that the switch is
global and affects ROMs on *all* devices. I guess there's nothing
simple to be done about that, though.

> + }
> +
> + /*
> + * Don't add anything to the BIOS allocated bridge window size for
> + * the same reason.
> + */
> + dev->is_hotplug_bridge = 0;

"is_hotplug_bridge" is also used in MPS management
(pcie_find_smpss()). Did you investigate that and make sure this is
safe? I think the Thunderbolt controller supports hotplug, and I'm
not sure what will happen if the MPS code assumes it doesn't.

> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1513, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x151a, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x151b, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1547, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1548, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1549, quirk_thunderbolt);
> +#endif
> --
> 1.8.3.1
>

2013-06-26 22:27:01

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <[email protected]> wrote:
> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> <[email protected]> wrote:
>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>> This means that the BIOS will allocate bridge resources based on some
>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>> hotplug of the root port where the Thunderbolt host router is connected.

We should not need tricks in this patch after

https://patchwork.kernel.org/patch/2766521/

[2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug

Thanks

Yinghai

2013-06-26 22:31:20

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <[email protected]> wrote:
> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <[email protected]> wrote:
>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>> <[email protected]> wrote:
>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>> This means that the BIOS will allocate bridge resources based on some
>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>> hotplug of the root port where the Thunderbolt host router is connected.
>
> We should not need tricks in this patch after
>
> https://patchwork.kernel.org/patch/2766521/
>
> [2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug
>

BTW, Rafael, looks like that "boot-time" is not accurate here.

During acpi hotplug, firmare could do extra help for us like assign
some resources to pci device bars, so it is NOT "boot-time".

Yinghai

2013-06-26 22:35:05

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wednesday, June 26, 2013 03:31:18 PM Yinghai Lu wrote:
> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <[email protected]> wrote:
> > On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <[email protected]> wrote:
> >> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> >> <[email protected]> wrote:
> >>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
> >>> This means that the BIOS will allocate bridge resources based on some
> >>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
> >>> hotplug of the root port where the Thunderbolt host router is connected.
> >
> > We should not need tricks in this patch after
> >
> > https://patchwork.kernel.org/patch/2766521/
> >
> > [2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug
> >
>
> BTW, Rafael, looks like that "boot-time" is not accurate here.
>
> During acpi hotplug, firmare could do extra help for us like assign
> some resources to pci device bars, so it is NOT "boot-time".

Well, Linus has merged it already and besides "boot-time rules" need not
mean "boot-time resources", right?

Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-06-26 22:38:53

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 3:44 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Wednesday, June 26, 2013 03:31:18 PM Yinghai Lu wrote:
>> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <[email protected]> wrote:
>> > On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <[email protected]> wrote:
>> >> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>> >> <[email protected]> wrote:
>> >>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>> >>> This means that the BIOS will allocate bridge resources based on some
>> >>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>> >>> hotplug of the root port where the Thunderbolt host router is connected.
>> >
>> > We should not need tricks in this patch after
>> >
>> > https://patchwork.kernel.org/patch/2766521/
>> >
>> > [2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug
>> >
>>
>> BTW, Rafael, looks like that "boot-time" is not accurate here.
>>
>> During acpi hotplug, firmare could do extra help for us like assign
>> some resources to pci device bars, so it is NOT "boot-time".
>
> Well, Linus has merged it already and besides "boot-time rules" need not
> mean "boot-time resources", right?

ok.

2013-06-26 22:55:28

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 4:31 PM, Yinghai Lu <[email protected]> wrote:
> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <[email protected]> wrote:
>> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <[email protected]> wrote:
>>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>>> <[email protected]> wrote:
>>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>>> This means that the BIOS will allocate bridge resources based on some
>>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>>> hotplug of the root port where the Thunderbolt host router is connected.
> ...
> During acpi hotplug, firmare could do extra help for us like assign
> some resources to pci device bars, so it is NOT "boot-time".

Really? How can firmware assign BARs at hotplug-time? I mean,
obviously firmware *can* write things to the BARs before giving the
device to the OS, but how would it know what to write? I assume the
OS owns the address space, and it can change the upstream bridge
windows or the BARs of another device on the bus at any time, subject
to the OS's own issues as far as quiescing or unbinding drivers, etc.,
but without coordinating with the BIOS.

Bjorn

2013-06-26 23:28:53

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
<[email protected]> wrote:
> From: "Kirill A. Shutemov" <[email protected]>
>
> With Thunderbolt you can chain devices: connect a new devices to plugged
> one. In this case the slot is already enabled, but we still want to look
> for new devices behind it.
>
> We're going to reuse enable_device() for rescan for new devices on the
> enabled slot. Let's push the check up by stack.
>
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> Signed-off-by: Mika Westerberg <[email protected]>
> ---
> drivers/pci/hotplug/acpiphp_glue.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index 59df857..b983e29 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -688,9 +688,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> int num, max, pass;
> LIST_HEAD(add_list);
>
> - if (slot->flags & SLOT_ENABLED)
> - goto err_exit;
> -
> list_for_each_entry(func, &slot->funcs, sibling)
> acpiphp_bus_add(func);
>
> @@ -1242,6 +1239,8 @@ int acpiphp_enable_slot(struct acpiphp_slot *slot)
> goto err_exit;
>
> if (get_slot_status(slot) == ACPI_STA_ALL) {
> + if (slot->flags & SLOT_ENABLED)
> + goto err_exit;

Why do we check for SLOT_ENABLED at all? I think we're handling a Bus
Check notification, which means "re-enumerate on the device tree
starting from the notification point." It doesn't say anything about
skipping the re-enumeration if we find a device that's already
enabled.

It seems like we ought to just re-enumerate all the way down in case a
device was added farther down in the tree (which is what it sounds
like Thunderbolt is doing).

> /* configure all functions */
> retval = enable_device(slot);
> if (retval)
> --
> 1.8.3.1
>

2013-06-26 23:38:21

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 2/6] PCI: acpiphp: enable_device(): rescan even if no new devices on slot

On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
<[email protected]> wrote:
> From: "Kirill A. Shutemov" <[email protected]>
>
> pci_scan_slot() returns number of new devices connected *directly*
> connected to the slot. Current enable_device() checks the return value
> and stops if it doesn't see a new device.
>
> In Thunderbolt chaining case the new device can be deeper in hierarchy, so
> do the rescan anyway.
>
> Because of that we must make sure that pcibios_resource_survey_bus() and
> check_hotplug_bridge() get called only for a just found bus and not the
> ones already added to the system. Failure to do so will lead to resource
> conflicts.
>
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> Signed-off-by: Mika Westerberg <[email protected]>
> ---
> drivers/pci/hotplug/acpiphp_glue.c | 18 +++++++-----------
> 1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index b983e29..80a6ea1 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -685,18 +685,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> struct pci_dev *dev;
> struct pci_bus *bus = slot->bridge->pci_bus;
> struct acpiphp_func *func;
> - int num, max, pass;
> + int max, pass;
> LIST_HEAD(add_list);
>
> list_for_each_entry(func, &slot->funcs, sibling)
> acpiphp_bus_add(func);
>
> - num = pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
> - if (num == 0) {
> - /* Maybe only part of funcs are added. */
> - dbg("No new device found\n");
> - goto err_exit;
> - }
> + pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
>
> max = acpiphp_max_busnr(bus);
> for (pass = 0; pass < 2; pass++) {

I think this is two logical changes: the change below looks like it
could by done by itself first, followed by the change above.

> @@ -707,8 +702,11 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
> max = pci_scan_bridge(bus, dev, max, pass);
> if (pass && dev->subordinate) {
> - check_hotplug_bridge(slot, dev);
> - pcibios_resource_survey_bus(dev->subordinate);
> + if (!dev->subordinate->is_added) {
> + check_hotplug_bridge(slot, dev);
> + pcibios_resource_survey_bus(
> + dev->subordinate);

It's a shame that pcibios_resource_survey_bus() can't be called twice.
It would be nice if it were smart enough to notice on the second call
that "oh, resources have already been assigned, so there's nothing
more to be done." Did you look to see whether that's feasible?

> + }
> __pci_bus_size_bridges(dev->subordinate,
> &add_list);
> }
> @@ -742,8 +740,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> }
> }
>
> -
> - err_exit:
> return 0;
> }
>
> --
> 1.8.3.1
>

2013-06-26 23:56:06

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 3:55 PM, Bjorn Helgaas <[email protected]> wrote:
> On Wed, Jun 26, 2013 at 4:31 PM, Yinghai Lu <[email protected]> wrote:
>> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <[email protected]> wrote:
>>> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <[email protected]> wrote:
>>>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>>>> <[email protected]> wrote:
>>>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>>>> This means that the BIOS will allocate bridge resources based on some
>>>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>>>> hotplug of the root port where the Thunderbolt host router is connected.
>> ...
>> During acpi hotplug, firmare could do extra help for us like assign
>> some resources to pci device bars, so it is NOT "boot-time".
>
> Really? How can firmware assign BARs at hotplug-time? I mean,
> obviously firmware *can* write things to the BARs before giving the
> device to the OS, but how would it know what to write?

should be acpi code, or SMI code or even BMC firmware via sideband.

> I assume the
> OS owns the address space, and it can change the upstream bridge
> windows or the BARs of another device on the bus at any time, subject
> to the OS's own issues as far as quiescing or unbinding drivers, etc.,
> but without coordinating with the BIOS.

for thunderbolt or dock with acpiphp, then all children devices/bridges should
not have drivers loaded yet.

Yinghai

2013-06-27 01:20:26

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 2/6] PCI: acpiphp: enable_device(): rescan even if no new devices on slot

On Tue, Jun 25, 2013 at 9:22 AM, Mika Westerberg
<[email protected]> wrote:
> From: "Kirill A. Shutemov" <[email protected]>
>
> pci_scan_slot() returns number of new devices connected *directly*
> connected to the slot. Current enable_device() checks the return value
> and stops if it doesn't see a new device.
>
> In Thunderbolt chaining case the new device can be deeper in hierarchy, so
> do the rescan anyway.
>
> Because of that we must make sure that pcibios_resource_survey_bus() and
> check_hotplug_bridge() get called only for a just found bus and not the
> ones already added to the system. Failure to do so will lead to resource
> conflicts.
>
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> Signed-off-by: Mika Westerberg <[email protected]>
> ---
> drivers/pci/hotplug/acpiphp_glue.c | 18 +++++++-----------
> 1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index b983e29..80a6ea1 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -685,18 +685,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> struct pci_dev *dev;
> struct pci_bus *bus = slot->bridge->pci_bus;
> struct acpiphp_func *func;
> - int num, max, pass;
> + int max, pass;
> LIST_HEAD(add_list);
>
> list_for_each_entry(func, &slot->funcs, sibling)
> acpiphp_bus_add(func);
>
> - num = pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
> - if (num == 0) {
> - /* Maybe only part of funcs are added. */
> - dbg("No new device found\n");
> - goto err_exit;
> - }
> + pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
>
> max = acpiphp_max_busnr(bus);
> for (pass = 0; pass < 2; pass++) {
> @@ -707,8 +702,11 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
> max = pci_scan_bridge(bus, dev, max, pass);
> if (pass && dev->subordinate) {
> - check_hotplug_bridge(slot, dev);
> - pcibios_resource_survey_bus(dev->subordinate);
> + if (!dev->subordinate->is_added) {
> + check_hotplug_bridge(slot, dev);
> + pcibios_resource_survey_bus(
> + dev->subordinate);
> + }

No,
this change will totally disable calling
check_hotplug_bridge/pcibios_resource_survey_bus()

as pci_scan_bridge/pci_scan_child_bus will set dev->subordinate->is_added...

Please note: recently is_added setting is moved early before
pci_bus_add_devices...

> __pci_bus_size_bridges(dev->subordinate,
> &add_list);
> }
> @@ -742,8 +740,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> }
> }
>
> -
> - err_exit:
> return 0;
> }
>
> --
> 1.8.3.1
>

2013-06-27 12:58:35

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 2/6] PCI: acpiphp: enable_device(): rescan even if no new devices on slot

On Wed, Jun 26, 2013 at 05:37:58PM -0600, Bjorn Helgaas wrote:
> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> <[email protected]> wrote:
> > From: "Kirill A. Shutemov" <[email protected]>
> >
> > pci_scan_slot() returns number of new devices connected *directly*
> > connected to the slot. Current enable_device() checks the return value
> > and stops if it doesn't see a new device.
> >
> > In Thunderbolt chaining case the new device can be deeper in hierarchy, so
> > do the rescan anyway.
> >
> > Because of that we must make sure that pcibios_resource_survey_bus() and
> > check_hotplug_bridge() get called only for a just found bus and not the
> > ones already added to the system. Failure to do so will lead to resource
> > conflicts.
> >
> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > Signed-off-by: Mika Westerberg <[email protected]>
> > ---
> > drivers/pci/hotplug/acpiphp_glue.c | 18 +++++++-----------
> > 1 file changed, 7 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index b983e29..80a6ea1 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -685,18 +685,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> > struct pci_dev *dev;
> > struct pci_bus *bus = slot->bridge->pci_bus;
> > struct acpiphp_func *func;
> > - int num, max, pass;
> > + int max, pass;
> > LIST_HEAD(add_list);
> >
> > list_for_each_entry(func, &slot->funcs, sibling)
> > acpiphp_bus_add(func);
> >
> > - num = pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
> > - if (num == 0) {
> > - /* Maybe only part of funcs are added. */
> > - dbg("No new device found\n");
> > - goto err_exit;
> > - }
> > + pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
> >
> > max = acpiphp_max_busnr(bus);
> > for (pass = 0; pass < 2; pass++) {
>
> I think this is two logical changes: the change below looks like it
> could by done by itself first, followed by the change above.

Indeed. We'll going to drop the second change completely.

>
> > @@ -707,8 +702,11 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> > dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
> > max = pci_scan_bridge(bus, dev, max, pass);
> > if (pass && dev->subordinate) {
> > - check_hotplug_bridge(slot, dev);
> > - pcibios_resource_survey_bus(dev->subordinate);
> > + if (!dev->subordinate->is_added) {
> > + check_hotplug_bridge(slot, dev);
> > + pcibios_resource_survey_bus(
> > + dev->subordinate);
>
> It's a shame that pcibios_resource_survey_bus() can't be called twice.
> It would be nice if it were smart enough to notice on the second call
> that "oh, resources have already been assigned, so there's nothing
> more to be done." Did you look to see whether that's feasible?

I didn't but now that you mentioned I went and checked. I'm pretty sure we
can handle it there in the next revision and drop this change.

2013-06-27 13:00:10

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 2/6] PCI: acpiphp: enable_device(): rescan even if no new devices on slot

On Wed, Jun 26, 2013 at 06:20:24PM -0700, Yinghai Lu wrote:
> On Tue, Jun 25, 2013 at 9:22 AM, Mika Westerberg
> <[email protected]> wrote:
> > From: "Kirill A. Shutemov" <[email protected]>
> >
> > pci_scan_slot() returns number of new devices connected *directly*
> > connected to the slot. Current enable_device() checks the return value
> > and stops if it doesn't see a new device.
> >
> > In Thunderbolt chaining case the new device can be deeper in hierarchy, so
> > do the rescan anyway.
> >
> > Because of that we must make sure that pcibios_resource_survey_bus() and
> > check_hotplug_bridge() get called only for a just found bus and not the
> > ones already added to the system. Failure to do so will lead to resource
> > conflicts.
> >
> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > Signed-off-by: Mika Westerberg <[email protected]>
> > ---
> > drivers/pci/hotplug/acpiphp_glue.c | 18 +++++++-----------
> > 1 file changed, 7 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index b983e29..80a6ea1 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -685,18 +685,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> > struct pci_dev *dev;
> > struct pci_bus *bus = slot->bridge->pci_bus;
> > struct acpiphp_func *func;
> > - int num, max, pass;
> > + int max, pass;
> > LIST_HEAD(add_list);
> >
> > list_for_each_entry(func, &slot->funcs, sibling)
> > acpiphp_bus_add(func);
> >
> > - num = pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
> > - if (num == 0) {
> > - /* Maybe only part of funcs are added. */
> > - dbg("No new device found\n");
> > - goto err_exit;
> > - }
> > + pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
> >
> > max = acpiphp_max_busnr(bus);
> > for (pass = 0; pass < 2; pass++) {
> > @@ -707,8 +702,11 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> > dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
> > max = pci_scan_bridge(bus, dev, max, pass);
> > if (pass && dev->subordinate) {
> > - check_hotplug_bridge(slot, dev);
> > - pcibios_resource_survey_bus(dev->subordinate);
> > + if (!dev->subordinate->is_added) {
> > + check_hotplug_bridge(slot, dev);
> > + pcibios_resource_survey_bus(
> > + dev->subordinate);
> > + }
>
> No,
> this change will totally disable calling
> check_hotplug_bridge/pcibios_resource_survey_bus()
>
> as pci_scan_bridge/pci_scan_child_bus will set dev->subordinate->is_added...
>
> Please note: recently is_added setting is moved early before
> pci_bus_add_devices...

OK, thanks.

We will handle this in pcibios_resource_survey_bus() instead as suggested
by Bjorn.

2013-06-27 13:06:11

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 04:15:01PM -0600, Alex Williamson wrote:
> On Wed, 2013-06-26 at 14:59 -0600, Bjorn Helgaas wrote:
> > [+cc Alex]
> >
> > On Wed, Jun 26, 2013 at 6:17 AM, Mika Westerberg
> > <[email protected]> wrote:
> > > On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
> > >> On Tue, 25 Jun 2013 19:22:10 +0300
> > >> Mika Westerberg <[email protected]> wrote:
> > >>
> > >> > + if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> > >> > + pr_info("Thunderbolt host router detected disabling ROMs\n");
> > >> > + pci_probe |= PCI_NOASSIGN_ROMS;
> > >> > + }
> > >>
> > >> I wonder if this should just be the default on x86? Or do we allocate
> > >> ROM space to address some other platform where we need it and the BIOS
> > >> doesn't do it for the devices we care about?
> > >
> > > Good question. In our case it definitely helps to have pci=norom the
> > > default. Can't tell if it might break something that depends on the current
> > > behaviour.
> >
> > I think the current default behavior is that if the BIOS has assigned
> > the ROM BAR, we keep that assignment, and if it hasn't, we allocate
> > MEM space for it. And "pci=norom" means that we don't allocate MEM
> > space for it, even if the BIOS hasn't assigned it.
> >
> > "pci=norom" is only implemented on x86. I think most other arches
> > allocate MEM space for ROMs, with no way to turn that off. PA-RISC
> > seems to ignore ROMs (dino_fixup_bus()), but that looks like the
> > exception.
> >
> > I'm slightly concerned that if we make the x86 default be "never
> > assign space for ROMs unless the BIOS has done it," we might break
> > virtualized guests that need access to ROMs. Alex?
>
> Yep, I'm more than slightly concerned by that too. Whenever possible we
> want to pass the ROM to the guest since it may end up being a boot
> device or drivers within the guest may require it. We can pass the ROM
> to the guest from an image file, but that requires someplace from which
> to dump the image, which is usually PCI sysfs. Thanks,

OK, then I guess changing the default is out of the question. We don't want
to break things.

2013-06-27 13:22:58

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

Bjorn Helgaas wrote:
> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> <[email protected]> wrote:
> > From: "Kirill A. Shutemov" <[email protected]>
> >
> > With Thunderbolt you can chain devices: connect a new devices to plugged
> > one. In this case the slot is already enabled, but we still want to look
> > for new devices behind it.
> >
> > We're going to reuse enable_device() for rescan for new devices on the
> > enabled slot. Let's push the check up by stack.
> >
> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > Signed-off-by: Mika Westerberg <[email protected]>
> > ---
> > drivers/pci/hotplug/acpiphp_glue.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index 59df857..b983e29 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -688,9 +688,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> > int num, max, pass;
> > LIST_HEAD(add_list);
> >
> > - if (slot->flags & SLOT_ENABLED)
> > - goto err_exit;
> > -
> > list_for_each_entry(func, &slot->funcs, sibling)
> > acpiphp_bus_add(func);
> >
> > @@ -1242,6 +1239,8 @@ int acpiphp_enable_slot(struct acpiphp_slot *slot)
> > goto err_exit;
> >
> > if (get_slot_status(slot) == ACPI_STA_ALL) {
> > + if (slot->flags & SLOT_ENABLED)
> > + goto err_exit;
>
> Why do we check for SLOT_ENABLED at all? I think we're handling a Bus
> Check notification, which means "re-enumerate on the device tree
> starting from the notification point." It doesn't say anything about
> skipping the re-enumeration if we find a device that's already
> enabled.
>
> It seems like we ought to just re-enumerate all the way down in case a
> device was added farther down in the tree (which is what it sounds
> like Thunderbolt is doing).

Okay, we will do more cleanup here.

The tricky part here is that acpiphp_enable_slot() is exposed to userspace
directly: /sys/bus/pci/slots/*/power files.

I wounder if complete drop of the check is an ABI change.

--
Kirill A. Shutemov

2013-06-27 13:50:08

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 04:18:53PM -0600, Bjorn Helgaas wrote:
> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> <[email protected]> wrote:
> > Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
> > This means that the BIOS will allocate bridge resources based on some
> > assumptions of a maximum Thunderbolt chain. It also disables native PCIe
> > hotplug of the root port where the Thunderbolt host router is connected.
>
> The BIOS often assigns PCI bridge windows, e.g., for root ports
> leading to ExpressCard slots. I assume BIOSes make similar
> assumptions there about what ExpressCards are likely to be plugged in.
> Is your BIOS assistance the same sort of thing, or is there something
> additional happening at hot-add time? (I think there might be AML
> that does Thunderbolt-specific stuff at hotplug-time, but I assume
> that's not related to resource assignment, because the OS owns those
> resources.)

Yes, if I understand it right the BIOS gets SCI on hotplug, then it
enumerates and configures the devices, and finally sends an ACPI bus check
event to the OS.

> > In order to support this we must make sure that the kernel does not try to
> > be too smart and resize / open the bridge windows during PCI enumeration.
> > For example by default the kernel will add certain amount of space to the
> > bridge memory/io windows (this is configurable via pci=hp[mem|io]size=xxx
> > command line option). Eventually we run out of space that the BIOS has
> > allocated.
>
> ROMs usually aren't very big compared to regular memory BARs. Is the
> problem just that adding space the BIOS didn't plan for causes the OS
> to increase the window size and bump into space assigned to a peer of
> the Thunderbolt controller?

You are right.

I did some more investigation on this and the BIOS seems to close the
bridge windows if there are no devices behind the bridge or the device is
not using certain type of resource (io/pmem). However, Linux then finds out
that the bridge supports these optional windows (in pci_bridge_check_ranges())
and because is_hotplug_bridge == 1, it adds the default sizes to the bridge
window resources causing this:

[ 15.753262] pci 0000:07:06.0: scanning [bus 80-80] behind bridge, pass 0
[ 15.754458] pci_bus 0000:80: scanning bus
[ 15.755563] pci_bus 0000:80: fixups for bus
[ 15.756668] pci 0000:07:06.0: PCI bridge to [bus 80]
...
[ 15.873433] pci 0000:07:06.0: BAR 7: can't assign io (size 0x1000)
[ 15.874542] pci 0000:07:06.0: BAR 8: can't assign mem (size 0x200000)
[ 15.875632] pci 0000:07:06.0: BAR 9: can't assign mem pref (size 0x200000)

The above bridge has all the windows closed by the BIOS.

If I have "pci=hpmemsize=0,hpiosize=0" in the kernel command line this
works.

> > Also address space for expansion ROMs should not be allocated (BIOS does
> > not execute them for Thunderbolt endpoints). If we don't prevent this the
> > kernel might find expansion ROM associated with some endpoint and reopen
> > the bridge window which the BIOS already closed leading again resource
> > exhaustion.
>
> I assume the only reason we should not allocate expansion ROM space is
> to avoid resource exhaustion. If we had enough resources, allocating
> ROM space wouldn't cause anything bad to happen, would it?

Right.

> > Fix this by adding a quirk that matches known Thunderbolt PCI-to-PCI
> > bridges and in that case prevents allocation of expansion ROM resources and
> > makes sure that the PCI core does not increase size of bridge windows.
> >
> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > Signed-off-by: Mika Westerberg <[email protected]>
> > ---
> > arch/x86/pci/fixup.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 51 insertions(+)
> >
> > diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> > index f5809fa..924822b 100644
> > --- a/arch/x86/pci/fixup.c
> > +++ b/arch/x86/pci/fixup.c
> > @@ -7,6 +7,8 @@
> > #include <linux/pci.h>
> > #include <linux/init.h>
> > #include <linux/vgaarb.h>
> > +#include <linux/acpi.h>
> > +#include <linux/pci-acpi.h>
> > #include <asm/pci_x86.h>
> >
> > static void pci_fixup_i450nx(struct pci_dev *d)
> > @@ -539,3 +541,52 @@ static void twinhead_reserve_killing_zone(struct pci_dev *dev)
> > }
> > }
> > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x27B9, twinhead_reserve_killing_zone);
> > +
> > +#ifdef CONFIG_ACPI
> > +/*
> > + * BIOS assisted Thunderbolt PCI enumeration should handle all resource
> > + * allocation on behalf of OS.
> > + */
> > +static void quirk_thunderbolt(struct pci_dev *dev)
> > +{
> > + struct acpi_pci_root *root;
> > + acpi_handle handle;
> > +
> > + handle = acpi_find_root_bridge_handle(dev);
> > + if (!handle)
> > + return;
> > +
> > + root = acpi_pci_find_root(handle);
> > + if (!root)
> > + return;
> > +
> > + /*
> > + * Native PCIe hotplug should be disabled when BIOS assisted
> > + * hotplug is in use.
> > + */
> > + if (root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
> > + return;
>
> I'm not really sure why this test is here. I dimly recall hearing
> that Thunderbolt hotplug requires some work at hotplug-time, and this
> is not all public, and hence is done by AML. But that in itself
> doesn't seem related to the question of allocating ROM space.

The check is here because we need to check that the native PCIe hotplug is
disabled by the BIOS. Thunderbolt supports standard PCIe hotplug but some
operating systems (not including Linux) can't handle that properly so BIOS
will do that on behalf of the OS.

It is possible that some systems use native PCIe hotplug (although we
haven't seen such yet). Hence we need to check it here and not apply the
quirk in that case.

> > + /*
> > + * Make sure that we don't allocate resources for expansion ROMs.
> > + * This may accidentally increase the size of the bridge window
> > + * causing us to run out of resources.
> > + */
> > + if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> > + pr_info("Thunderbolt host router detected disabling ROMs\n");
>
> We have a pci_dev, so this should use dev_info().

OK.

> > + pci_probe |= PCI_NOASSIGN_ROMS;
>
> I think you really only care about the space for ROMs on devices
> connected via Thunderbolt, so it's kind of a shame that the switch is
> global and affects ROMs on *all* devices. I guess there's nothing
> simple to be done about that, though.

Well, now that I understand this a bit better, I think we don't need to
set the PCI_NOASSIGN_ROMS anymore...

> > + }
> > +
> > + /*
> > + * Don't add anything to the BIOS allocated bridge window size for
> > + * the same reason.
> > + */
> > + dev->is_hotplug_bridge = 0;

...nor do this

I think that we can get this working so that we add a new flag to struct
pci_dev, something like 'no_additional_hotplug_bus_space' and in this quirk
set that.

Then in __pci_bus_size_bridges() we do:

pci_bridge_check_ranges(bus);
if (bus->self->is_hotplug_bridge &&
!bus->self->no_additional_hotplug_bus_space) {
additional_io_size = pci_hotplug_io_size;
additional_mem_size = pci_hotplug_mem_size;
}

This should prevent the problem this patch was trying to solve. Does that
work for you?

Of course, once we do that the user is not able to change the defaults for
Thunderbolt PCI bridges anymore by passing new values from the kernel
command line. Not sure if it is needed, though.

2013-06-27 13:54:35

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 03:26:59PM -0700, Yinghai Lu wrote:
> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <[email protected]> wrote:
> > On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> > <[email protected]> wrote:
> >> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
> >> This means that the BIOS will allocate bridge resources based on some
> >> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
> >> hotplug of the root port where the Thunderbolt host router is connected.
>
> We should not need tricks in this patch after
>
> https://patchwork.kernel.org/patch/2766521/
>
> [2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug

Unfortunately that patch is not enough :-( We still need to make sure that
we don't add anything to the bridge window sizes like __pci_bus_size_bridges()
is currently doing (e.g similar to pci=hpmemsize=0,hpiosize=0 for the
Thunderbolt bridges).

2013-06-27 16:01:12

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Wed, Jun 26, 2013 at 5:56 PM, Yinghai Lu <[email protected]> wrote:
> On Wed, Jun 26, 2013 at 3:55 PM, Bjorn Helgaas <[email protected]> wrote:
>> On Wed, Jun 26, 2013 at 4:31 PM, Yinghai Lu <[email protected]> wrote:
>>> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <[email protected]> wrote:
>>>> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <[email protected]> wrote:
>>>>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>>>>> <[email protected]> wrote:
>>>>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>>>>> This means that the BIOS will allocate bridge resources based on some
>>>>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>>>>> hotplug of the root port where the Thunderbolt host router is connected.
>>> ...
>>> During acpi hotplug, firmare could do extra help for us like assign
>>> some resources to pci device bars, so it is NOT "boot-time".
>>
>> Really? How can firmware assign BARs at hotplug-time? I mean,
>> obviously firmware *can* write things to the BARs before giving the
>> device to the OS, but how would it know what to write?
>
> should be acpi code, or SMI code or even BMC firmware via sideband.

How would that code (ACPI code (by which I assume you mean AML), SMI
code, BMC firmware) know what values to write to BARs? Is it reading
the windows of upstream bridges from config space? Is it assuming the
OS never changes the windows of bridges upstream from the Thunderbolt
controller?

>> I assume the
>> OS owns the address space, and it can change the upstream bridge
>> windows or the BARs of another device on the bus at any time, subject
>> to the OS's own issues as far as quiescing or unbinding drivers, etc.,
>> but without coordinating with the BIOS.
>
> for thunderbolt or dock with acpiphp, then all children devices/bridges should
> not have drivers loaded yet.

I said "upstream bridge windows or ... another device on the bus,"
i.e., I'm referring to devices other than the Thunderbolt controller.
I assume the OS is free to change resource assignments for those
non-Thunderbolt devices, and obviously those assignments affect what
is available for Thunderbolt.

Bjorn

2013-06-27 16:23:42

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Thu, Jun 27, 2013 at 04:54:05PM +0300, Mika Westerberg wrote:
> I think that we can get this working so that we add a new flag to struct
> pci_dev, something like 'no_additional_hotplug_bus_space' and in this quirk
> set that.
>
> Then in __pci_bus_size_bridges() we do:
>
> pci_bridge_check_ranges(bus);
> if (bus->self->is_hotplug_bridge &&
> !bus->self->no_additional_hotplug_bus_space) {
> additional_io_size = pci_hotplug_io_size;
> additional_mem_size = pci_hotplug_mem_size;
> }
>
> This should prevent the problem this patch was trying to solve. Does that
> work for you?

Forget about that -- It looks like these messages are harmless:

pcieport 0000:0a:05.0: BAR 8: can't assign mem (size 0x200000)
pcieport 0000:0a:05.0: BAR 9: can't assign mem pref (size 0x200000)

It just means that we tried to allocate the resource but failed because the
bridge has that window closed, if I got it right. If user doesn't want to
see those, he/she can always pass 'pci=hpmensize=0,hpiosize=0' in the
kernel command line.

With pcibios_resource_survey_bus() fix, looks like this quirk is not needed
at all.

2013-06-27 16:28:48

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 2/6] PCI: acpiphp: enable_device(): rescan even if no new devices on slot

On Thu, Jun 27, 2013 at 04:02:29PM +0300, Mika Westerberg wrote:
> On Wed, Jun 26, 2013 at 05:37:58PM -0600, Bjorn Helgaas wrote:
> > > @@ -707,8 +702,11 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> > > dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
> > > max = pci_scan_bridge(bus, dev, max, pass);
> > > if (pass && dev->subordinate) {
> > > - check_hotplug_bridge(slot, dev);
> > > - pcibios_resource_survey_bus(dev->subordinate);
> > > + if (!dev->subordinate->is_added) {
> > > + check_hotplug_bridge(slot, dev);
> > > + pcibios_resource_survey_bus(
> > > + dev->subordinate);
> >
> > It's a shame that pcibios_resource_survey_bus() can't be called twice.
> > It would be nice if it were smart enough to notice on the second call
> > that "oh, resources have already been assigned, so there's nothing
> > more to be done." Did you look to see whether that's feasible?
>
> I didn't but now that you mentioned I went and checked. I'm pretty sure we
> can handle it there in the next revision and drop this change.

With following patch that fixes pcibios_resource_survey_bus() Thunderbolt
seems to be working just fine (and the quirk is gone).

If it looks okay, I'll include it into the next iteration of the patches.

diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index 94919e3..db6b1ab 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -210,6 +210,8 @@ static void pcibios_allocate_bridge_resources(struct pci_dev *dev)
r = &dev->resource[idx];
if (!r->flags)
continue;
+ if (r->parent) /* Already allocated */
+ continue;
if (!r->start || pci_claim_resource(dev, idx) < 0) {
/*
* Something is wrong with the region.
@@ -318,6 +320,8 @@ static void pcibios_allocate_dev_rom_resource(struct pci_dev *dev)
r = &dev->resource[PCI_ROM_RESOURCE];
if (!r->flags || !r->start)
return;
+ if (r->parent) /* Already allocated */
+ return;

if (pci_claim_resource(dev, PCI_ROM_RESOURCE) < 0) {
r->end -= r->start;

2013-06-27 16:50:53

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 2/6] PCI: acpiphp: enable_device(): rescan even if no new devices on slot

On Thu, Jun 27, 2013 at 10:32 AM, Mika Westerberg
<[email protected]> wrote:
> On Thu, Jun 27, 2013 at 04:02:29PM +0300, Mika Westerberg wrote:
>> On Wed, Jun 26, 2013 at 05:37:58PM -0600, Bjorn Helgaas wrote:
>> > > @@ -707,8 +702,11 @@ static int __ref enable_device(struct acpiphp_slot *slot)
>> > > dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
>> > > max = pci_scan_bridge(bus, dev, max, pass);
>> > > if (pass && dev->subordinate) {
>> > > - check_hotplug_bridge(slot, dev);
>> > > - pcibios_resource_survey_bus(dev->subordinate);
>> > > + if (!dev->subordinate->is_added) {
>> > > + check_hotplug_bridge(slot, dev);
>> > > + pcibios_resource_survey_bus(
>> > > + dev->subordinate);
>> >
>> > It's a shame that pcibios_resource_survey_bus() can't be called twice.
>> > It would be nice if it were smart enough to notice on the second call
>> > that "oh, resources have already been assigned, so there's nothing
>> > more to be done." Did you look to see whether that's feasible?
>>
>> I didn't but now that you mentioned I went and checked. I'm pretty sure we
>> can handle it there in the next revision and drop this change.
>
> With following patch that fixes pcibios_resource_survey_bus() Thunderbolt
> seems to be working just fine (and the quirk is gone).
>
> If it looks okay, I'll include it into the next iteration of the patches.

That looks great, thanks a lot for exploring this!

> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
> index 94919e3..db6b1ab 100644
> --- a/arch/x86/pci/i386.c
> +++ b/arch/x86/pci/i386.c
> @@ -210,6 +210,8 @@ static void pcibios_allocate_bridge_resources(struct pci_dev *dev)
> r = &dev->resource[idx];
> if (!r->flags)
> continue;
> + if (r->parent) /* Already allocated */
> + continue;
> if (!r->start || pci_claim_resource(dev, idx) < 0) {
> /*
> * Something is wrong with the region.
> @@ -318,6 +320,8 @@ static void pcibios_allocate_dev_rom_resource(struct pci_dev *dev)
> r = &dev->resource[PCI_ROM_RESOURCE];
> if (!r->flags || !r->start)
> return;
> + if (r->parent) /* Already allocated */
> + return;
>
> if (pci_claim_resource(dev, PCI_ROM_RESOURCE) < 0) {
> r->end -= r->start;

2013-06-27 16:54:51

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 2/6] PCI: acpiphp: enable_device(): rescan even if no new devices on slot

On Thu, Jun 27, 2013 at 10:32 AM, Mika Westerberg
<[email protected]> wrote:
> On Thu, Jun 27, 2013 at 04:02:29PM +0300, Mika Westerberg wrote:
>> On Wed, Jun 26, 2013 at 05:37:58PM -0600, Bjorn Helgaas wrote:
>> > > @@ -707,8 +702,11 @@ static int __ref enable_device(struct acpiphp_slot *slot)
>> > > dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
>> > > max = pci_scan_bridge(bus, dev, max, pass);
>> > > if (pass && dev->subordinate) {
>> > > - check_hotplug_bridge(slot, dev);
>> > > - pcibios_resource_survey_bus(dev->subordinate);
>> > > + if (!dev->subordinate->is_added) {
>> > > + check_hotplug_bridge(slot, dev);
>> > > + pcibios_resource_survey_bus(
>> > > + dev->subordinate);
>> >
>> > It's a shame that pcibios_resource_survey_bus() can't be called twice.
>> > It would be nice if it were smart enough to notice on the second call
>> > that "oh, resources have already been assigned, so there's nothing
>> > more to be done." Did you look to see whether that's feasible?
>>
>> I didn't but now that you mentioned I went and checked. I'm pretty sure we
>> can handle it there in the next revision and drop this change.
>
> With following patch that fixes pcibios_resource_survey_bus() Thunderbolt
> seems to be working just fine (and the quirk is gone).
>
> If it looks okay, I'll include it into the next iteration of the patches.

That looks great, thanks a lot for exploring this!

> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
> index 94919e3..db6b1ab 100644
> --- a/arch/x86/pci/i386.c
> +++ b/arch/x86/pci/i386.c
> @@ -210,6 +210,8 @@ static void pcibios_allocate_bridge_resources(struct pci_dev *dev)
> r = &dev->resource[idx];
> if (!r->flags)
> continue;
> + if (r->parent) /* Already allocated */
> + continue;
> if (!r->start || pci_claim_resource(dev, idx) < 0) {
> /*
> * Something is wrong with the region.
> @@ -318,6 +320,8 @@ static void pcibios_allocate_dev_rom_resource(struct pci_dev *dev)
> r = &dev->resource[PCI_ROM_RESOURCE];
> if (!r->flags || !r->start)
> return;
> + if (r->parent) /* Already allocated */
> + return;
>
> if (pci_claim_resource(dev, PCI_ROM_RESOURCE) < 0) {
> r->end -= r->start;

2013-06-27 17:18:22

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Thu, Jun 27, 2013 at 9:27 AM, Mika Westerberg
<[email protected]> wrote:
> On Thu, Jun 27, 2013 at 04:54:05PM +0300, Mika Westerberg wrote:
>> I think that we can get this working so that we add a new flag to struct
>> pci_dev, something like 'no_additional_hotplug_bus_space' and in this quirk
>> set that.
>>
>> Then in __pci_bus_size_bridges() we do:
>>
>> pci_bridge_check_ranges(bus);
>> if (bus->self->is_hotplug_bridge &&
>> !bus->self->no_additional_hotplug_bus_space) {
>> additional_io_size = pci_hotplug_io_size;
>> additional_mem_size = pci_hotplug_mem_size;
>> }
>>
>> This should prevent the problem this patch was trying to solve. Does that
>> work for you?
>
> Forget about that -- It looks like these messages are harmless:
>
> pcieport 0000:0a:05.0: BAR 8: can't assign mem (size 0x200000)
> pcieport 0000:0a:05.0: BAR 9: can't assign mem pref (size 0x200000)
>
> It just means that we tried to allocate the resource but failed because the
> bridge has that window closed, if I got it right. If user doesn't want to
> see those, he/she can always pass 'pci=hpmensize=0,hpiosize=0' in the
> kernel command line.

Yes, that extra range for hotplug is optional, so if must+optional
fails, second try
will be must only.

>
> With pcibios_resource_survey_bus() fix, looks like this quirk is not needed
> at all.

Good.

2013-06-27 17:28:01

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

On Thu, Jun 27, 2013 at 9:00 AM, Bjorn Helgaas <[email protected]> wrote:
> On Wed, Jun 26, 2013 at 5:56 PM, Yinghai Lu <[email protected]> wrote:
>> On Wed, Jun 26, 2013 at 3:55 PM, Bjorn Helgaas <[email protected]> wrote:
>>> On Wed, Jun 26, 2013 at 4:31 PM, Yinghai Lu <[email protected]> wrote:
>>>> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <[email protected]> wrote:
>>>>> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <[email protected]> wrote:
>>>>>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>>>>>> <[email protected]> wrote:
>>>>>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>>>>>> This means that the BIOS will allocate bridge resources based on some
>>>>>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>>>>>> hotplug of the root port where the Thunderbolt host router is connected.
>>>> ...
>>>> During acpi hotplug, firmare could do extra help for us like assign
>>>> some resources to pci device bars, so it is NOT "boot-time".
>>>
>>> Really? How can firmware assign BARs at hotplug-time? I mean,
>>> obviously firmware *can* write things to the BARs before giving the
>>> device to the OS, but how would it know what to write?
>>
>> should be acpi code, or SMI code or even BMC firmware via sideband.
>
> How would that code (ACPI code (by which I assume you mean AML), SMI
> code, BMC firmware) know what values to write to BARs? Is it reading
> the windows of upstream bridges from config space? Is it assuming the
> OS never changes the windows of bridges upstream from the Thunderbolt
> controller?

Kernel only try to change upstream bridge for pciehp during hot-add.

for acpiphp, because could have other devices already there before hot-add,
kernel does not try to expand the upstream bridge. size and assign only
to the new added devices.

>
>>> I assume the
>>> OS owns the address space, and it can change the upstream bridge
>>> windows or the BARs of another device on the bus at any time, subject
>>> to the OS's own issues as far as quiescing or unbinding drivers, etc.,
>>> but without coordinating with the BIOS.
>>
>> for thunderbolt or dock with acpiphp, then all children devices/bridges should
>> not have drivers loaded yet.
>
> I said "upstream bridge windows or ... another device on the bus,"
> i.e., I'm referring to devices other than the Thunderbolt controller.
> I assume the OS is free to change resource assignments for those
> non-Thunderbolt devices, and obviously those assignments affect what
> is available for Thunderbolt.

ok. that upstream one can not be touched from firmware.

2013-06-27 19:05:05

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 4/6] PCI: acpiphp: check for new devices on enabled host

On Tue, Jun 25, 2013 at 9:22 AM, Mika Westerberg
<[email protected]> wrote:
> From: "Kirill A. Shutemov" <[email protected]>
>
> Current acpiphp_check_bridge() implementation is pretty dumb:
> - it enables the slot if it's not enabled and the slot status is
> ACPI_STA_ALL;
> - it disables the slot if it's enabled and slot is not in ACPI_STA_ALL
> state.
>
> This behavior is not enough to handle Thunderbolt chaining case
> properly. We need to actually rescan for new devices even if a device
> has already in the slot.
>
> The new implementation disables and stops the slot if it's not in
> ACPI_STA_ALL state.
>
> For ACPI_STA_ALL state we first trim devices which don't respond and
> look for the ones after that. We do that even if slot already enabled
> (SLOT_ENABLED).

that is not right, some time BUS_CHECK is even sent root bus.
in that case, stop all devices in slots and load driver again.

like you put one card in one slots, but all devices in other slots get stop
and enable again.

Thanks

Yinghai

2013-06-28 09:30:49

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 4/6] PCI: acpiphp: check for new devices on enabled host

Yinghai Lu wrote:
> On Tue, Jun 25, 2013 at 9:22 AM, Mika Westerberg
> <[email protected]> wrote:
> > From: "Kirill A. Shutemov" <[email protected]>
> >
> > Current acpiphp_check_bridge() implementation is pretty dumb:
> > - it enables the slot if it's not enabled and the slot status is
> > ACPI_STA_ALL;
> > - it disables the slot if it's enabled and slot is not in ACPI_STA_ALL
> > state.
> >
> > This behavior is not enough to handle Thunderbolt chaining case
> > properly. We need to actually rescan for new devices even if a device
> > has already in the slot.
> >
> > The new implementation disables and stops the slot if it's not in
> > ACPI_STA_ALL state.
> >
> > For ACPI_STA_ALL state we first trim devices which don't respond and
> > look for the ones after that. We do that even if slot already enabled
> > (SLOT_ENABLED).
>
> that is not right, some time BUS_CHECK is even sent root bus.
> in that case, stop all devices in slots and load driver again.
>
> like you put one card in one slots, but all devices in other slots get stop
> and enable again.

We don't stop enabled devices, we only stop and remove devices which don't
respond. See patch 3/6.

I don't see how it's harmful. Do you?

--
Kirill A. Shutemov

2013-06-28 10:05:51

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

Bjorn Helgaas wrote:
> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> <[email protected]> wrote:
> > From: "Kirill A. Shutemov" <[email protected]>
> >
> > With Thunderbolt you can chain devices: connect a new devices to plugged
> > one. In this case the slot is already enabled, but we still want to look
> > for new devices behind it.
> >
> > We're going to reuse enable_device() for rescan for new devices on the
> > enabled slot. Let's push the check up by stack.
> >
> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > Signed-off-by: Mika Westerberg <[email protected]>
> > ---
> > drivers/pci/hotplug/acpiphp_glue.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index 59df857..b983e29 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -688,9 +688,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> > int num, max, pass;
> > LIST_HEAD(add_list);
> >
> > - if (slot->flags & SLOT_ENABLED)
> > - goto err_exit;
> > -
> > list_for_each_entry(func, &slot->funcs, sibling)
> > acpiphp_bus_add(func);
> >
> > @@ -1242,6 +1239,8 @@ int acpiphp_enable_slot(struct acpiphp_slot *slot)
> > goto err_exit;
> >
> > if (get_slot_status(slot) == ACPI_STA_ALL) {
> > + if (slot->flags & SLOT_ENABLED)
> > + goto err_exit;
>
> Why do we check for SLOT_ENABLED at all? I think we're handling a Bus
> Check notification, which means "re-enumerate on the device tree
> starting from the notification point." It doesn't say anything about
> skipping the re-enumeration if we find a device that's already
> enabled.
>
> It seems like we ought to just re-enumerate all the way down in case a
> device was added farther down in the tree (which is what it sounds
> like Thunderbolt is doing).

Currently (with patchset applied), we have two users of
acpiphp_enable_slot():

- /sys/bus/pci/slots/*/power
- ACPI_NOTIFY_BUS_CHECK in _handle_hotplug_event_func().

Both are not related to Thunderbolt.

Although, I think remove the check is good idea, I prefer to keep it
separate from Thunderbolt enabling patchset, since it will change sysfs
ABI a bit and can potentially affect othe ACPI PCI hotplug
implementations.

--
Kirill A. Shutemov

2013-06-28 16:22:35

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 4/6] PCI: acpiphp: check for new devices on enabled host

On Fri, Jun 28, 2013 at 2:33 AM, Kirill A. Shutemov
<[email protected]> wrote:
> Yinghai Lu wrote:
>> On Tue, Jun 25, 2013 at 9:22 AM, Mika Westerberg
>> <[email protected]> wrote:
>> > From: "Kirill A. Shutemov" <[email protected]>
>> >
>> > Current acpiphp_check_bridge() implementation is pretty dumb:
>> > - it enables the slot if it's not enabled and the slot status is
>> > ACPI_STA_ALL;
>> > - it disables the slot if it's enabled and slot is not in ACPI_STA_ALL
>> > state.
>> >
>> > This behavior is not enough to handle Thunderbolt chaining case
>> > properly. We need to actually rescan for new devices even if a device
>> > has already in the slot.
>> >
>> > The new implementation disables and stops the slot if it's not in
>> > ACPI_STA_ALL state.
>> >
>> > For ACPI_STA_ALL state we first trim devices which don't respond and
>> > look for the ones after that. We do that even if slot already enabled
>> > (SLOT_ENABLED).
>>
>> that is not right, some time BUS_CHECK is even sent root bus.
>> in that case, stop all devices in slots and load driver again.
>>
>> like you put one card in one slots, but all devices in other slots get stop
>> and enable again.
>
> We don't stop enabled devices, we only stop and remove devices which don't
> respond. See patch 3/6.
>
> I don't see how it's harmful. Do you?

then please check with disable_device to put back pci_dev ref,
also may need to trim corresponding acpi devices.

so this patch is helping: multiple plug-in and remove?

Yinghai

2013-06-28 17:00:54

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Fri, Jun 28, 2013 at 3:51 AM, Kirill A. Shutemov
<[email protected]> wrote:
> Bjorn Helgaas wrote:
>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>> <[email protected]> wrote:
>> > From: "Kirill A. Shutemov" <[email protected]>
>> >
>> > With Thunderbolt you can chain devices: connect a new devices to plugged
>> > one. In this case the slot is already enabled, but we still want to look
>> > for new devices behind it.
>> >
>> > We're going to reuse enable_device() for rescan for new devices on the
>> > enabled slot. Let's push the check up by stack.
>> >
>> > Signed-off-by: Kirill A. Shutemov <[email protected]>
>> > Signed-off-by: Mika Westerberg <[email protected]>
>> > ---
>> > drivers/pci/hotplug/acpiphp_glue.c | 5 ++---
>> > 1 file changed, 2 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
>> > index 59df857..b983e29 100644
>> > --- a/drivers/pci/hotplug/acpiphp_glue.c
>> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
>> > @@ -688,9 +688,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
>> > int num, max, pass;
>> > LIST_HEAD(add_list);
>> >
>> > - if (slot->flags & SLOT_ENABLED)
>> > - goto err_exit;
>> > -
>> > list_for_each_entry(func, &slot->funcs, sibling)
>> > acpiphp_bus_add(func);
>> >
>> > @@ -1242,6 +1239,8 @@ int acpiphp_enable_slot(struct acpiphp_slot *slot)
>> > goto err_exit;
>> >
>> > if (get_slot_status(slot) == ACPI_STA_ALL) {
>> > + if (slot->flags & SLOT_ENABLED)
>> > + goto err_exit;
>>
>> Why do we check for SLOT_ENABLED at all? I think we're handling a Bus
>> Check notification, which means "re-enumerate on the device tree
>> starting from the notification point." It doesn't say anything about
>> skipping the re-enumeration if we find a device that's already
>> enabled.
>>
>> It seems like we ought to just re-enumerate all the way down in case a
>> device was added farther down in the tree (which is what it sounds
>> like Thunderbolt is doing).
>
> Currently (with patchset applied), we have two users of
> acpiphp_enable_slot():
>
> - /sys/bus/pci/slots/*/power
> - ACPI_NOTIFY_BUS_CHECK in _handle_hotplug_event_func().
>
> Both are not related to Thunderbolt.
>
> Although, I think remove the check is good idea, I prefer to keep it
> separate from Thunderbolt enabling patchset, since it will change sysfs
> ABI a bit and can potentially affect othe ACPI PCI hotplug
> implementations.

I'll think about this some more, but if we can make a change that
simplifies things and makes them more spec-compliant, and also happens
to make Thunderbolt work, that sounds better than fixing Thunderbolt
while leaving the wart there.

If we only fix Thunderbolt, it just feels like we're adding to an
ever-growing "deferred maintenance" list.

Bjorn

2013-06-28 18:45:19

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Friday, June 28, 2013 11:00:31 AM Bjorn Helgaas wrote:
> On Fri, Jun 28, 2013 at 3:51 AM, Kirill A. Shutemov
> <[email protected]> wrote:
> > Bjorn Helgaas wrote:
> >> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> >> <[email protected]> wrote:
> >> > From: "Kirill A. Shutemov" <[email protected]>
> >> >
> >> > With Thunderbolt you can chain devices: connect a new devices to plugged
> >> > one. In this case the slot is already enabled, but we still want to look
> >> > for new devices behind it.
> >> >
> >> > We're going to reuse enable_device() for rescan for new devices on the
> >> > enabled slot. Let's push the check up by stack.
> >> >
> >> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> >> > Signed-off-by: Mika Westerberg <[email protected]>
> >> > ---
> >> > drivers/pci/hotplug/acpiphp_glue.c | 5 ++---
> >> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >> >
> >> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> >> > index 59df857..b983e29 100644
> >> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> >> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> >> > @@ -688,9 +688,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> >> > int num, max, pass;
> >> > LIST_HEAD(add_list);
> >> >
> >> > - if (slot->flags & SLOT_ENABLED)
> >> > - goto err_exit;
> >> > -
> >> > list_for_each_entry(func, &slot->funcs, sibling)
> >> > acpiphp_bus_add(func);
> >> >
> >> > @@ -1242,6 +1239,8 @@ int acpiphp_enable_slot(struct acpiphp_slot *slot)
> >> > goto err_exit;
> >> >
> >> > if (get_slot_status(slot) == ACPI_STA_ALL) {
> >> > + if (slot->flags & SLOT_ENABLED)
> >> > + goto err_exit;
> >>
> >> Why do we check for SLOT_ENABLED at all? I think we're handling a Bus
> >> Check notification, which means "re-enumerate on the device tree
> >> starting from the notification point." It doesn't say anything about
> >> skipping the re-enumeration if we find a device that's already
> >> enabled.
> >>
> >> It seems like we ought to just re-enumerate all the way down in case a
> >> device was added farther down in the tree (which is what it sounds
> >> like Thunderbolt is doing).
> >
> > Currently (with patchset applied), we have two users of
> > acpiphp_enable_slot():
> >
> > - /sys/bus/pci/slots/*/power
> > - ACPI_NOTIFY_BUS_CHECK in _handle_hotplug_event_func().
> >
> > Both are not related to Thunderbolt.
> >
> > Although, I think remove the check is good idea, I prefer to keep it
> > separate from Thunderbolt enabling patchset, since it will change sysfs
> > ABI a bit and can potentially affect othe ACPI PCI hotplug
> > implementations.
>
> I'll think about this some more, but if we can make a change that
> simplifies things and makes them more spec-compliant, and also happens
> to make Thunderbolt work, that sounds better than fixing Thunderbolt
> while leaving the wart there.
>
> If we only fix Thunderbolt, it just feels like we're adding to an
> ever-growing "deferred maintenance" list.

I agree.

That change may be done in a separate patch, but it should be included in the
series.

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-06-28 19:49:58

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 3/6] PCI: introduce pci_trim_stale_devices()

On Tuesday, June 25, 2013 07:22:07 PM Mika Westerberg wrote:
> From: "Kirill A. Shutemov" <[email protected]>
>
> The new function stops and removes the device if it doesn't respond.
> If the device responds and it's a bus we apply the function to all
> subdevices recursively.
>
> It's useful for hotplug bus check case, when you need drop all unplugged
> devices before looking for new ones.
>
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> Signed-off-by: Mika Westerberg <[email protected]>

I think it'd be better to fold this into [4/6]. This way it would be clearer
what that patch did.

Thanks,
Rafael


> ---
> drivers/pci/remove.c | 20 ++++++++++++++++++++
> include/linux/pci.h | 1 +
> 2 files changed, 21 insertions(+)
>
> diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
> index 8fc54b7..77b7a64 100644
> --- a/drivers/pci/remove.c
> +++ b/drivers/pci/remove.c
> @@ -112,6 +112,26 @@ void pci_stop_and_remove_bus_device(struct pci_dev *dev)
> }
> EXPORT_SYMBOL(pci_stop_and_remove_bus_device);
>
> +/**
> + * pci_trim_stale_devices - remove stale device or any stale child
> + */
> +void pci_trim_stale_devices(struct pci_dev *dev)
> +{
> + struct pci_bus *bus = dev->subordinate;
> + struct pci_dev *child, *tmp;
> + bool alive;
> + u32 l;
> +
> + /* check if the device responds */
> + alive = pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &l, 0);
> + if (!alive)
> + pci_stop_and_remove_bus_device(dev);
> +
> + if (alive && bus)
> + list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
> + pci_trim_stale_devices(child);
> +}
> +
> void pci_stop_root_bus(struct pci_bus *bus)
> {
> struct pci_dev *child, *tmp;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 3a24e4f..8f6e7a0 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -754,6 +754,7 @@ struct pci_dev *pci_dev_get(struct pci_dev *dev);
> void pci_dev_put(struct pci_dev *dev);
> void pci_remove_bus(struct pci_bus *b);
> void pci_stop_and_remove_bus_device(struct pci_dev *dev);
> +void pci_trim_stale_devices(struct pci_dev *dev);
> void pci_stop_root_bus(struct pci_bus *bus);
> void pci_remove_root_bus(struct pci_bus *bus);
> void pci_setup_cardbus(struct pci_bus *bus);
>
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-06-28 19:55:23

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 4/6] PCI: acpiphp: check for new devices on enabled host

On Friday, June 28, 2013 09:22:32 AM Yinghai Lu wrote:
> On Fri, Jun 28, 2013 at 2:33 AM, Kirill A. Shutemov
> <[email protected]> wrote:
> > Yinghai Lu wrote:
> >> On Tue, Jun 25, 2013 at 9:22 AM, Mika Westerberg
> >> <[email protected]> wrote:
> >> > From: "Kirill A. Shutemov" <[email protected]>
> >> >
> >> > Current acpiphp_check_bridge() implementation is pretty dumb:
> >> > - it enables the slot if it's not enabled and the slot status is
> >> > ACPI_STA_ALL;
> >> > - it disables the slot if it's enabled and slot is not in ACPI_STA_ALL
> >> > state.
> >> >
> >> > This behavior is not enough to handle Thunderbolt chaining case
> >> > properly. We need to actually rescan for new devices even if a device
> >> > has already in the slot.
> >> >
> >> > The new implementation disables and stops the slot if it's not in
> >> > ACPI_STA_ALL state.
> >> >
> >> > For ACPI_STA_ALL state we first trim devices which don't respond and
> >> > look for the ones after that. We do that even if slot already enabled
> >> > (SLOT_ENABLED).
> >>
> >> that is not right, some time BUS_CHECK is even sent root bus.
> >> in that case, stop all devices in slots and load driver again.
> >>
> >> like you put one card in one slots, but all devices in other slots get stop
> >> and enable again.
> >
> > We don't stop enabled devices, we only stop and remove devices which don't
> > respond. See patch 3/6.
> >
> > I don't see how it's harmful. Do you?
>
> then please check with disable_device to put back pci_dev ref,
> also may need to trim corresponding acpi devices.

That's correct. Thunderbolt may not need that, but ACPI device objects need
to be trimmed too in general.

> so this patch is helping: multiple plug-in and remove?

I think it helps the case when a Thunderbolt device is removed and we only
get a bus check notify for the controller after the fact.

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-07-01 09:28:21

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Fri, Jun 28, 2013 at 08:54:45PM +0200, Rafael J. Wysocki wrote:
> On Friday, June 28, 2013 11:00:31 AM Bjorn Helgaas wrote:
> > On Fri, Jun 28, 2013 at 3:51 AM, Kirill A. Shutemov
> > <[email protected]> wrote:
> > > Bjorn Helgaas wrote:
> > >> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> > >> <[email protected]> wrote:
> > >> > From: "Kirill A. Shutemov" <[email protected]>
> > >> >
> > >> > With Thunderbolt you can chain devices: connect a new devices to plugged
> > >> > one. In this case the slot is already enabled, but we still want to look
> > >> > for new devices behind it.
> > >> >
> > >> > We're going to reuse enable_device() for rescan for new devices on the
> > >> > enabled slot. Let's push the check up by stack.
> > >> >
> > >> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > >> > Signed-off-by: Mika Westerberg <[email protected]>
> > >> > ---
> > >> > drivers/pci/hotplug/acpiphp_glue.c | 5 ++---
> > >> > 1 file changed, 2 insertions(+), 3 deletions(-)
> > >> >
> > >> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > >> > index 59df857..b983e29 100644
> > >> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > >> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > >> > @@ -688,9 +688,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> > >> > int num, max, pass;
> > >> > LIST_HEAD(add_list);
> > >> >
> > >> > - if (slot->flags & SLOT_ENABLED)
> > >> > - goto err_exit;
> > >> > -
> > >> > list_for_each_entry(func, &slot->funcs, sibling)
> > >> > acpiphp_bus_add(func);
> > >> >
> > >> > @@ -1242,6 +1239,8 @@ int acpiphp_enable_slot(struct acpiphp_slot *slot)
> > >> > goto err_exit;
> > >> >
> > >> > if (get_slot_status(slot) == ACPI_STA_ALL) {
> > >> > + if (slot->flags & SLOT_ENABLED)
> > >> > + goto err_exit;
> > >>
> > >> Why do we check for SLOT_ENABLED at all? I think we're handling a Bus
> > >> Check notification, which means "re-enumerate on the device tree
> > >> starting from the notification point." It doesn't say anything about
> > >> skipping the re-enumeration if we find a device that's already
> > >> enabled.
> > >>
> > >> It seems like we ought to just re-enumerate all the way down in case a
> > >> device was added farther down in the tree (which is what it sounds
> > >> like Thunderbolt is doing).
> > >
> > > Currently (with patchset applied), we have two users of
> > > acpiphp_enable_slot():
> > >
> > > - /sys/bus/pci/slots/*/power
> > > - ACPI_NOTIFY_BUS_CHECK in _handle_hotplug_event_func().
> > >
> > > Both are not related to Thunderbolt.
> > >
> > > Although, I think remove the check is good idea, I prefer to keep it
> > > separate from Thunderbolt enabling patchset, since it will change sysfs
> > > ABI a bit and can potentially affect othe ACPI PCI hotplug
> > > implementations.
> >
> > I'll think about this some more, but if we can make a change that
> > simplifies things and makes them more spec-compliant, and also happens
> > to make Thunderbolt work, that sounds better than fixing Thunderbolt
> > while leaving the wart there.
> >
> > If we only fix Thunderbolt, it just feels like we're adding to an
> > ever-growing "deferred maintenance" list.
>
> I agree.
>
> That change may be done in a separate patch, but it should be included in the
> series.

Given the fact that SLOT_ENABLED is only checked in acpiphp_enable_slot()
(after this patch) and that /sys/bus/pci/slots/*/power uses SLOT_POWEREDON
anyway, should we remove the whole flag?

2013-07-01 13:52:05

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Monday, July 01, 2013 12:32:17 PM Mika Westerberg wrote:
> On Fri, Jun 28, 2013 at 08:54:45PM +0200, Rafael J. Wysocki wrote:
> > On Friday, June 28, 2013 11:00:31 AM Bjorn Helgaas wrote:
> > > On Fri, Jun 28, 2013 at 3:51 AM, Kirill A. Shutemov
> > > <[email protected]> wrote:
> > > > Bjorn Helgaas wrote:
> > > >> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> > > >> <[email protected]> wrote:
> > > >> > From: "Kirill A. Shutemov" <[email protected]>
> > > >> >
> > > >> > With Thunderbolt you can chain devices: connect a new devices to plugged
> > > >> > one. In this case the slot is already enabled, but we still want to look
> > > >> > for new devices behind it.
> > > >> >
> > > >> > We're going to reuse enable_device() for rescan for new devices on the
> > > >> > enabled slot. Let's push the check up by stack.
> > > >> >
> > > >> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > > >> > Signed-off-by: Mika Westerberg <[email protected]>
> > > >> > ---
> > > >> > drivers/pci/hotplug/acpiphp_glue.c | 5 ++---
> > > >> > 1 file changed, 2 insertions(+), 3 deletions(-)
> > > >> >
> > > >> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > > >> > index 59df857..b983e29 100644
> > > >> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > > >> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > > >> > @@ -688,9 +688,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> > > >> > int num, max, pass;
> > > >> > LIST_HEAD(add_list);
> > > >> >
> > > >> > - if (slot->flags & SLOT_ENABLED)
> > > >> > - goto err_exit;
> > > >> > -
> > > >> > list_for_each_entry(func, &slot->funcs, sibling)
> > > >> > acpiphp_bus_add(func);
> > > >> >
> > > >> > @@ -1242,6 +1239,8 @@ int acpiphp_enable_slot(struct acpiphp_slot *slot)
> > > >> > goto err_exit;
> > > >> >
> > > >> > if (get_slot_status(slot) == ACPI_STA_ALL) {
> > > >> > + if (slot->flags & SLOT_ENABLED)
> > > >> > + goto err_exit;
> > > >>
> > > >> Why do we check for SLOT_ENABLED at all? I think we're handling a Bus
> > > >> Check notification, which means "re-enumerate on the device tree
> > > >> starting from the notification point." It doesn't say anything about
> > > >> skipping the re-enumeration if we find a device that's already
> > > >> enabled.
> > > >>
> > > >> It seems like we ought to just re-enumerate all the way down in case a
> > > >> device was added farther down in the tree (which is what it sounds
> > > >> like Thunderbolt is doing).
> > > >
> > > > Currently (with patchset applied), we have two users of
> > > > acpiphp_enable_slot():
> > > >
> > > > - /sys/bus/pci/slots/*/power
> > > > - ACPI_NOTIFY_BUS_CHECK in _handle_hotplug_event_func().
> > > >
> > > > Both are not related to Thunderbolt.
> > > >
> > > > Although, I think remove the check is good idea, I prefer to keep it
> > > > separate from Thunderbolt enabling patchset, since it will change sysfs
> > > > ABI a bit and can potentially affect othe ACPI PCI hotplug
> > > > implementations.
> > >
> > > I'll think about this some more, but if we can make a change that
> > > simplifies things and makes them more spec-compliant, and also happens
> > > to make Thunderbolt work, that sounds better than fixing Thunderbolt
> > > while leaving the wart there.
> > >
> > > If we only fix Thunderbolt, it just feels like we're adding to an
> > > ever-growing "deferred maintenance" list.
> >
> > I agree.
> >
> > That change may be done in a separate patch, but it should be included in the
> > series.
>
> Given the fact that SLOT_ENABLED is only checked in acpiphp_enable_slot()
> (after this patch) and that /sys/bus/pci/slots/*/power uses SLOT_POWEREDON
> anyway, should we remove the whole flag?

Sure, if it is not necessary any more, we should remove it.

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-07-01 18:32:16

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Mon, Jul 01, 2013 at 04:01:37PM +0200, Rafael J. Wysocki wrote:
> > Given the fact that SLOT_ENABLED is only checked in acpiphp_enable_slot()
> > (after this patch) and that /sys/bus/pci/slots/*/power uses SLOT_POWEREDON
> > anyway, should we remove the whole flag?
>
> Sure, if it is not necessary any more, we should remove it.

Well, there is one thing that changes due that. Once the flag is gone
userspace can do 'echo 1 > /sys/bus/pci/slots/*/power' several times and
the slot is always re-enumerated.

If that is not acceptable we should probably move the SLOT_ENABLED check
closer to acpiphp_core:enable_device() and drop it from here, so that we
always re-enumerate on Bus Check event but userspace can only do enable
once (we still re-enumerate on Bus Check).

2013-07-02 01:20:15

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Monday, July 01, 2013 09:36:13 PM Mika Westerberg wrote:
> On Mon, Jul 01, 2013 at 04:01:37PM +0200, Rafael J. Wysocki wrote:
> > > Given the fact that SLOT_ENABLED is only checked in acpiphp_enable_slot()
> > > (after this patch) and that /sys/bus/pci/slots/*/power uses SLOT_POWEREDON
> > > anyway, should we remove the whole flag?
> >
> > Sure, if it is not necessary any more, we should remove it.
>
> Well, there is one thing that changes due that. Once the flag is gone
> userspace can do 'echo 1 > /sys/bus/pci/slots/*/power' several times and
> the slot is always re-enumerated.
>
> If that is not acceptable we should probably move the SLOT_ENABLED check
> closer to acpiphp_core:enable_device() and drop it from here, so that we
> always re-enumerate on Bus Check event but userspace can only do enable
> once (we still re-enumerate on Bus Check).

Yes, that sounds like the right thing to do.

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-07-02 10:41:15

by Kirill A. Shutemov

[permalink] [raw]
Subject: RE: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

Mika Westerberg wrote:
> The acpiphp driver finds out whether the device is hotpluggable by checking
> whether it has _RMV method behind it (and if it returns 1). However, at
> least Acer Aspire S5 with Thunderbolt host router has this method placed
> behind device called EPUP (endpoint upstream port?) and not directly behind
> the root port as can be seen from the ASL code below:
>
> Device (RP05)
> {
> ...
> Device (HRUP)
> {
> Name (_ADR, Zero)
> Name (_PRW, Package (0x02)
> {
> 0x09,
> 0x04
> })
> Device (HRDN)
> {
> Name (_ADR, 0x00040000)
> Name (_PRW, Package (0x02)
> {
> 0x09,
> 0x04
> })
> Device (EPUP)
> {
> Name (_ADR, Zero)
> Method (_RMV, 0, NotSerialized)
> {
> Return (One)
> }
> }
> }
> }
>
> If we want to support such machines we must look for the _RMV method a bit
> deeper in the hierarchy. Fix this by changing pcihp_is_ejectable() to check
> few more devices down from the root port.

We found that this approach is broken. We've got false positive: host
bridge itself was detected as hotplugable slot %) I think it's not
acceptable.

Mika has tried few more approaches, but we haven't found anything better
then hardcoded path like in original workaround patch[1]. It's not generic
at all, but safe from false positives.

Any thoughts?

[1] http://article.gmane.org/gmane.linux.kernel.pci/19102

--
Kirill A. Shutemov

2013-07-02 16:41:05

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Mon, Jul 1, 2013 at 7:29 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Monday, July 01, 2013 09:36:13 PM Mika Westerberg wrote:
>> On Mon, Jul 01, 2013 at 04:01:37PM +0200, Rafael J. Wysocki wrote:
>> > > Given the fact that SLOT_ENABLED is only checked in acpiphp_enable_slot()
>> > > (after this patch) and that /sys/bus/pci/slots/*/power uses SLOT_POWEREDON
>> > > anyway, should we remove the whole flag?
>> >
>> > Sure, if it is not necessary any more, we should remove it.
>>
>> Well, there is one thing that changes due that. Once the flag is gone
>> userspace can do 'echo 1 > /sys/bus/pci/slots/*/power' several times and
>> the slot is always re-enumerated.
>>
>> If that is not acceptable we should probably move the SLOT_ENABLED check
>> closer to acpiphp_core:enable_device() and drop it from here, so that we
>> always re-enumerate on Bus Check event but userspace can only do enable
>> once (we still re-enumerate on Bus Check).
>
> Yes, that sounds like the right thing to do.

Is it actually a problem if we re-enumerate every time userspace does
'echo 1 > /sys/bus/pci/slots/*/power'? I assume re-enumeration is a
no-op if nothing has changed.

Bjorn

2013-07-02 17:09:42

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

On Tue, Jul 2, 2013 at 4:44 AM, Kirill A. Shutemov
<[email protected]> wrote:
> Mika Westerberg wrote:
>> The acpiphp driver finds out whether the device is hotpluggable by checking
>> whether it has _RMV method behind it (and if it returns 1). However, at
>> least Acer Aspire S5 with Thunderbolt host router has this method placed
>> behind device called EPUP (endpoint upstream port?) and not directly behind
>> the root port as can be seen from the ASL code below:
>>
>> Device (RP05)
>> {
>> ...
>> Device (HRUP)
>> {
>> Name (_ADR, Zero)
>> Name (_PRW, Package (0x02)
>> {
>> 0x09,
>> 0x04
>> })
>> Device (HRDN)
>> {
>> Name (_ADR, 0x00040000)
>> Name (_PRW, Package (0x02)
>> {
>> 0x09,
>> 0x04
>> })
>> Device (EPUP)
>> {
>> Name (_ADR, Zero)
>> Method (_RMV, 0, NotSerialized)
>> {
>> Return (One)
>> }
>> }
>> }
>> }
>>
>> If we want to support such machines we must look for the _RMV method a bit
>> deeper in the hierarchy. Fix this by changing pcihp_is_ejectable() to check
>> few more devices down from the root port.
>
> We found that this approach is broken. We've got false positive: host
> bridge itself was detected as hotplugable slot %) I think it's not
> acceptable.
>
> Mika has tried few more approaches, but we haven't found anything better
> then hardcoded path like in original workaround patch[1]. It's not generic
> at all, but safe from false positives.
>
> Any thoughts?
>
> [1] http://article.gmane.org/gmane.linux.kernel.pci/19102

The changelog of this patch ([1]) says "Correct ACPI PCI hotplug
implementation should have _RMV method in a PCI slot (device under pci
bridge). In Acer Aspire S5 case we have it deeper in hierarchy ..."

This is exactly what I mean about acpiphp being brittle because of the
assumptions it makes about the ACPI namespace. Is there actually
something in the spec that requires the _RMV method to be where
pcihp_is_ejectable() expects it?

I'm not 100% dead-set against merging the workaround with hard-coded
path, but I still don't think it's a good idea. It "fixes" it for one
particular machine, but there will likely be other machines that
require similar fixes in the future. It makes it harder for somebody
to clean up the design later, because that person won't have an Aspire
S5 to test. It makes it less likely that somebody *will* clean it up
later, because "everything is already working."

That's why my preference (given infinite resources) would be to rework
acpiphp now, while people are interested and can test it. I'm sure
this would be a major redesign of acpiphp and its interaction with
pciehp, but I think it's something we're going to have to do at some
point.

Bjorn

2013-07-02 17:42:52

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

Bjorn Helgaas wrote:
> On Tue, Jul 2, 2013 at 4:44 AM, Kirill A. Shutemov
> <[email protected]> wrote:
> > Mika Westerberg wrote:
> >> The acpiphp driver finds out whether the device is hotpluggable by checking
> >> whether it has _RMV method behind it (and if it returns 1). However, at
> >> least Acer Aspire S5 with Thunderbolt host router has this method placed
> >> behind device called EPUP (endpoint upstream port?) and not directly behind
> >> the root port as can be seen from the ASL code below:
> >>
> >> Device (RP05)
> >> {
> >> ...
> >> Device (HRUP)
> >> {
> >> Name (_ADR, Zero)
> >> Name (_PRW, Package (0x02)
> >> {
> >> 0x09,
> >> 0x04
> >> })
> >> Device (HRDN)
> >> {
> >> Name (_ADR, 0x00040000)
> >> Name (_PRW, Package (0x02)
> >> {
> >> 0x09,
> >> 0x04
> >> })
> >> Device (EPUP)
> >> {
> >> Name (_ADR, Zero)
> >> Method (_RMV, 0, NotSerialized)
> >> {
> >> Return (One)
> >> }
> >> }
> >> }
> >> }
> >>
> >> If we want to support such machines we must look for the _RMV method a bit
> >> deeper in the hierarchy. Fix this by changing pcihp_is_ejectable() to check
> >> few more devices down from the root port.
> >
> > We found that this approach is broken. We've got false positive: host
> > bridge itself was detected as hotplugable slot %) I think it's not
> > acceptable.
> >
> > Mika has tried few more approaches, but we haven't found anything better
> > then hardcoded path like in original workaround patch[1]. It's not generic
> > at all, but safe from false positives.
> >
> > Any thoughts?
> >
> > [1] http://article.gmane.org/gmane.linux.kernel.pci/19102
>
> The changelog of this patch ([1]) says "Correct ACPI PCI hotplug
> implementation should have _RMV method in a PCI slot (device under pci
> bridge). In Acer Aspire S5 case we have it deeper in hierarchy ..."
>
> This is exactly what I mean about acpiphp being brittle because of the
> assumptions it makes about the ACPI namespace. Is there actually
> something in the spec that requires the _RMV method to be where
> pcihp_is_ejectable() expects it?

Spec says that _RMV indicates the device which can be removed.
Rest, I believe, are assumptions: if the device is removable, then parent
must be a hotpluggable slot.

It looks reasonable, but doesn't work in this case. And it's not obvious
how we can generalize the assumption to cover the case.

> I'm not 100% dead-set against merging the workaround with hard-coded
> path, but I still don't think it's a good idea. It "fixes" it for one
> particular machine, but there will likely be other machines that
> require similar fixes in the future. It makes it harder for somebody
> to clean up the design later, because that person won't have an Aspire
> S5 to test. It makes it less likely that somebody *will* clean it up
> later, because "everything is already working."
>
> That's why my preference (given infinite resources) would be to rework
> acpiphp now, while people are interested and can test it. I'm sure
> this would be a major redesign of acpiphp and its interaction with
> pciehp, but I think it's something we're going to have to do at some
> point.

Basically, we run out of ideas. Any input is welcome! :)

For now we can post new version of patchset without the workaround and
come back later with workaround (or proper solution if we'll find any).

Does it work for you? Or you prefer everything in one patchset?

--
Kirill A. Shutemov

2013-07-02 20:19:42

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Tuesday, July 02, 2013 10:40:39 AM Bjorn Helgaas wrote:
> On Mon, Jul 1, 2013 at 7:29 PM, Rafael J. Wysocki <[email protected]> wrote:
> > On Monday, July 01, 2013 09:36:13 PM Mika Westerberg wrote:
> >> On Mon, Jul 01, 2013 at 04:01:37PM +0200, Rafael J. Wysocki wrote:
> >> > > Given the fact that SLOT_ENABLED is only checked in acpiphp_enable_slot()
> >> > > (after this patch) and that /sys/bus/pci/slots/*/power uses SLOT_POWEREDON
> >> > > anyway, should we remove the whole flag?
> >> >
> >> > Sure, if it is not necessary any more, we should remove it.
> >>
> >> Well, there is one thing that changes due that. Once the flag is gone
> >> userspace can do 'echo 1 > /sys/bus/pci/slots/*/power' several times and
> >> the slot is always re-enumerated.
> >>
> >> If that is not acceptable we should probably move the SLOT_ENABLED check
> >> closer to acpiphp_core:enable_device() and drop it from here, so that we
> >> always re-enumerate on Bus Check event but userspace can only do enable
> >> once (we still re-enumerate on Bus Check).
> >
> > Yes, that sounds like the right thing to do.
>
> Is it actually a problem if we re-enumerate every time userspace does
> 'echo 1 > /sys/bus/pci/slots/*/power'? I assume re-enumeration is a
> no-op if nothing has changed.

Well, if it's a no-op in that case, then re-enumerating shouldn't be a problem,
but is it a no-op?

Rafael (who's not really sure)


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-07-02 20:27:23

by Mika Westerberg

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Tue, Jul 02, 2013 at 10:29:12PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, July 02, 2013 10:40:39 AM Bjorn Helgaas wrote:
> > On Mon, Jul 1, 2013 at 7:29 PM, Rafael J. Wysocki <[email protected]> wrote:
> > > On Monday, July 01, 2013 09:36:13 PM Mika Westerberg wrote:
> > >> On Mon, Jul 01, 2013 at 04:01:37PM +0200, Rafael J. Wysocki wrote:
> > >> > > Given the fact that SLOT_ENABLED is only checked in acpiphp_enable_slot()
> > >> > > (after this patch) and that /sys/bus/pci/slots/*/power uses SLOT_POWEREDON
> > >> > > anyway, should we remove the whole flag?
> > >> >
> > >> > Sure, if it is not necessary any more, we should remove it.
> > >>
> > >> Well, there is one thing that changes due that. Once the flag is gone
> > >> userspace can do 'echo 1 > /sys/bus/pci/slots/*/power' several times and
> > >> the slot is always re-enumerated.
> > >>
> > >> If that is not acceptable we should probably move the SLOT_ENABLED check
> > >> closer to acpiphp_core:enable_device() and drop it from here, so that we
> > >> always re-enumerate on Bus Check event but userspace can only do enable
> > >> once (we still re-enumerate on Bus Check).
> > >
> > > Yes, that sounds like the right thing to do.
> >
> > Is it actually a problem if we re-enumerate every time userspace does
> > 'echo 1 > /sys/bus/pci/slots/*/power'? I assume re-enumeration is a
> > no-op if nothing has changed.
>
> Well, if it's a no-op in that case, then re-enumerating shouldn't be a problem,
> but is it a no-op?

I can confirm that it's a no-op (at least for the Thunderbolt case).
Basically we just scan for new devices and nothing is to be found.

2013-07-02 20:31:22

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

On Tuesday, July 02, 2013 11:09:19 AM Bjorn Helgaas wrote:
> On Tue, Jul 2, 2013 at 4:44 AM, Kirill A. Shutemov
> <[email protected]> wrote:
> > Mika Westerberg wrote:
> >> The acpiphp driver finds out whether the device is hotpluggable by checking
> >> whether it has _RMV method behind it (and if it returns 1). However, at
> >> least Acer Aspire S5 with Thunderbolt host router has this method placed
> >> behind device called EPUP (endpoint upstream port?) and not directly behind
> >> the root port as can be seen from the ASL code below:
> >>
> >> Device (RP05)
> >> {
> >> ...
> >> Device (HRUP)
> >> {
> >> Name (_ADR, Zero)
> >> Name (_PRW, Package (0x02)
> >> {
> >> 0x09,
> >> 0x04
> >> })
> >> Device (HRDN)
> >> {
> >> Name (_ADR, 0x00040000)
> >> Name (_PRW, Package (0x02)
> >> {
> >> 0x09,
> >> 0x04
> >> })
> >> Device (EPUP)
> >> {
> >> Name (_ADR, Zero)
> >> Method (_RMV, 0, NotSerialized)
> >> {
> >> Return (One)
> >> }
> >> }
> >> }
> >> }
> >>
> >> If we want to support such machines we must look for the _RMV method a bit
> >> deeper in the hierarchy. Fix this by changing pcihp_is_ejectable() to check
> >> few more devices down from the root port.
> >
> > We found that this approach is broken. We've got false positive: host
> > bridge itself was detected as hotplugable slot %) I think it's not
> > acceptable.
> >
> > Mika has tried few more approaches, but we haven't found anything better
> > then hardcoded path like in original workaround patch[1]. It's not generic
> > at all, but safe from false positives.
> >
> > Any thoughts?
> >
> > [1] http://article.gmane.org/gmane.linux.kernel.pci/19102
>
> The changelog of this patch ([1]) says "Correct ACPI PCI hotplug
> implementation should have _RMV method in a PCI slot (device under pci
> bridge). In Acer Aspire S5 case we have it deeper in hierarchy ..."
>
> This is exactly what I mean about acpiphp being brittle because of the
> assumptions it makes about the ACPI namespace. Is there actually
> something in the spec that requires the _RMV method to be where
> pcihp_is_ejectable() expects it?
>
> I'm not 100% dead-set against merging the workaround with hard-coded
> path, but I still don't think it's a good idea. It "fixes" it for one
> particular machine, but there will likely be other machines that
> require similar fixes in the future. It makes it harder for somebody
> to clean up the design later, because that person won't have an Aspire
> S5 to test. It makes it less likely that somebody *will* clean it up
> later, because "everything is already working."
>
> That's why my preference (given infinite resources) would be to rework
> acpiphp now, while people are interested and can test it.

Well, I have a plan to do just that, but quite frankly I'd prefer to make
things physically work first and *then* to make them nicer.

> I'm sure this would be a major redesign of acpiphp and its interaction with
> pciehp, but I think it's something we're going to have to do at some
> point.

Yes, we need to do that pretty much as soon as reasonably possible, but let's
not hold back changes allowing people to use their hardware with Linux because
of that. We very well may need to spend a couple of releases on it.

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-07-02 20:39:02

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 5/6] PCI: acpiphp: look _RMV method a bit deeper in the hierarhcy

On Tuesday, July 02, 2013 08:45:42 PM Kirill A. Shutemov wrote:
> Bjorn Helgaas wrote:
> > On Tue, Jul 2, 2013 at 4:44 AM, Kirill A. Shutemov
> > <[email protected]> wrote:
> > > Mika Westerberg wrote:
> > >> The acpiphp driver finds out whether the device is hotpluggable by checking
> > >> whether it has _RMV method behind it (and if it returns 1). However, at
> > >> least Acer Aspire S5 with Thunderbolt host router has this method placed
> > >> behind device called EPUP (endpoint upstream port?) and not directly behind
> > >> the root port as can be seen from the ASL code below:
> > >>
> > >> Device (RP05)
> > >> {
> > >> ...
> > >> Device (HRUP)
> > >> {
> > >> Name (_ADR, Zero)
> > >> Name (_PRW, Package (0x02)
> > >> {
> > >> 0x09,
> > >> 0x04
> > >> })
> > >> Device (HRDN)
> > >> {
> > >> Name (_ADR, 0x00040000)
> > >> Name (_PRW, Package (0x02)
> > >> {
> > >> 0x09,
> > >> 0x04
> > >> })
> > >> Device (EPUP)
> > >> {
> > >> Name (_ADR, Zero)
> > >> Method (_RMV, 0, NotSerialized)
> > >> {
> > >> Return (One)
> > >> }
> > >> }
> > >> }
> > >> }
> > >>
> > >> If we want to support such machines we must look for the _RMV method a bit
> > >> deeper in the hierarchy. Fix this by changing pcihp_is_ejectable() to check
> > >> few more devices down from the root port.
> > >
> > > We found that this approach is broken. We've got false positive: host
> > > bridge itself was detected as hotplugable slot %) I think it's not
> > > acceptable.
> > >
> > > Mika has tried few more approaches, but we haven't found anything better
> > > then hardcoded path like in original workaround patch[1]. It's not generic
> > > at all, but safe from false positives.
> > >
> > > Any thoughts?
> > >
> > > [1] http://article.gmane.org/gmane.linux.kernel.pci/19102
> >
> > The changelog of this patch ([1]) says "Correct ACPI PCI hotplug
> > implementation should have _RMV method in a PCI slot (device under pci
> > bridge). In Acer Aspire S5 case we have it deeper in hierarchy ..."
> >
> > This is exactly what I mean about acpiphp being brittle because of the
> > assumptions it makes about the ACPI namespace. Is there actually
> > something in the spec that requires the _RMV method to be where
> > pcihp_is_ejectable() expects it?
>
> Spec says that _RMV indicates the device which can be removed.
> Rest, I believe, are assumptions: if the device is removable, then parent
> must be a hotpluggable slot.

Where "hotpluggable slot" is a very weakly defined entity. :-)

_RMV means: this device only supports surprise-style removal. Moreover, it
is mandatory for removable devices without _LCK or _EJx. _RMV on one device
doesn't actually mean anything for other devices.

> It looks reasonable, but doesn't work in this case. And it's not obvious
> how we can generalize the assumption to cover the case.
>
> > I'm not 100% dead-set against merging the workaround with hard-coded
> > path, but I still don't think it's a good idea. It "fixes" it for one
> > particular machine, but there will likely be other machines that
> > require similar fixes in the future. It makes it harder for somebody
> > to clean up the design later, because that person won't have an Aspire
> > S5 to test. It makes it less likely that somebody *will* clean it up
> > later, because "everything is already working."
> >
> > That's why my preference (given infinite resources) would be to rework
> > acpiphp now, while people are interested and can test it. I'm sure
> > this would be a major redesign of acpiphp and its interaction with
> > pciehp, but I think it's something we're going to have to do at some
> > point.
>
> Basically, we run out of ideas. Any input is welcome! :)
>
> For now we can post new version of patchset without the workaround and
> come back later with workaround (or proper solution if we'll find any).
>
> Does it work for you? Or you prefer everything in one patchset?

Please post the current version of the patchset with as much of the feedback
taken into account as possible without breaking functionality.

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-07-02 20:40:14

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 1/6] PCI: acpiphp: do not check for SLOT_ENABLED in enable_device()

On Tuesday, July 02, 2013 11:31:17 PM Mika Westerberg wrote:
> On Tue, Jul 02, 2013 at 10:29:12PM +0200, Rafael J. Wysocki wrote:
> > On Tuesday, July 02, 2013 10:40:39 AM Bjorn Helgaas wrote:
> > > On Mon, Jul 1, 2013 at 7:29 PM, Rafael J. Wysocki <[email protected]> wrote:
> > > > On Monday, July 01, 2013 09:36:13 PM Mika Westerberg wrote:
> > > >> On Mon, Jul 01, 2013 at 04:01:37PM +0200, Rafael J. Wysocki wrote:
> > > >> > > Given the fact that SLOT_ENABLED is only checked in acpiphp_enable_slot()
> > > >> > > (after this patch) and that /sys/bus/pci/slots/*/power uses SLOT_POWEREDON
> > > >> > > anyway, should we remove the whole flag?
> > > >> >
> > > >> > Sure, if it is not necessary any more, we should remove it.
> > > >>
> > > >> Well, there is one thing that changes due that. Once the flag is gone
> > > >> userspace can do 'echo 1 > /sys/bus/pci/slots/*/power' several times and
> > > >> the slot is always re-enumerated.
> > > >>
> > > >> If that is not acceptable we should probably move the SLOT_ENABLED check
> > > >> closer to acpiphp_core:enable_device() and drop it from here, so that we
> > > >> always re-enumerate on Bus Check event but userspace can only do enable
> > > >> once (we still re-enumerate on Bus Check).
> > > >
> > > > Yes, that sounds like the right thing to do.
> > >
> > > Is it actually a problem if we re-enumerate every time userspace does
> > > 'echo 1 > /sys/bus/pci/slots/*/power'? I assume re-enumeration is a
> > > no-op if nothing has changed.
> >
> > Well, if it's a no-op in that case, then re-enumerating shouldn't be a problem,
> > but is it a no-op?
>
> I can confirm that it's a no-op (at least for the Thunderbolt case).
> Basically we just scan for new devices and nothing is to be found.

We can try to drop SLOT_ENABLED entirely then I suppose.

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.