2013-10-10 21:18:04

by Linus Torvalds

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thu, Oct 10, 2013 at 1:59 PM, Steven Rostedt <[email protected]> wrote:
>
> Unfortunately it's not a single commit. I tried to bisect it better,
> but it really is the merge. I'm thinking a chance in one merge caused
> the WARN_ON() to trigger that was added in a later merge.

That looks a bit unlikely. The whole acpiphp_init/get/put_context()
logic is entirely internal to drivers/pci/hotplug/acpiphp_glue.c, and
that merge gets absolutely all the changes from one side, except for a
one-liner change to that file that looks entirely unrelated (commit
928bea964827 "PCI: Delay enabling bridges until they're needed").

That said, that one commit did cause other problems (see commit
f41f064cf435: "PCI: Workaround missing pci_set_master in pci
drivers"), so who knows. Some subtle interaction with exactly when the
hotplug functions end up being called?

But it could possibly be timing-related too. Does everything behind
that hp bridge still work?

Linus


2013-10-10 21:24:12

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thursday, October 10, 2013 02:17:58 PM Linus Torvalds wrote:
> On Thu, Oct 10, 2013 at 1:59 PM, Steven Rostedt <[email protected]> wrote:
> >
> > Unfortunately it's not a single commit. I tried to bisect it better,
> > but it really is the merge. I'm thinking a chance in one merge caused
> > the WARN_ON() to trigger that was added in a later merge.

Well, I must have overlooked the original report. Is it available anywhere
I can find it?

Rafael

2013-10-10 21:37:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thu, Oct 10, 2013 at 2:35 PM, Rafael J. Wysocki <[email protected]> wrote:
>
> Well, I must have overlooked the original report. Is it available anywhere
> I can find it?

I think Steven has some buggered email system, he has other emails
being eaten by lkml too, and apparently other mail gateways (because
you were direct-cc'd on the original).

His email sender doesn't quote names with "," in them, and has headers
like this:

From: Steven Rostedt <[email protected]>
To: LKML <[email protected]>
Cc: Linus Torvalds <[email protected]>, Rafael J. Wysocki
<[email protected]>, Mika Westerberg
<[email protected]>, Bjorn Helgaas <[email protected]>,
Andrew Morton <[email protected]>

which is apparently against SMTP rules. The magic line is:

X-Mailer: Claws Mail 3.9.2 (GTK+ 2.24.20; x86_64-pc-linux-gnu)

The tag-line for that mailer is quite appropriate: "The email reader
that bites". That's what they put in the title on their web-page.

Because it sure bites. Except the claws people seemed to think that
was supposed to be a good thing. They clearly don't know the slang
meaning of "that bites".

Or maybe they do, and they are just unusually self-aware.

Steven, I'd suggest just jettisoning that mailer.

Linus

2013-10-10 22:31:17

by Jonathan Corbet

[permalink] [raw]
Subject: Re: Claws (was [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c)

On Thu, 10 Oct 2013 14:37:15 -0700
Linus Torvalds <[email protected]> wrote:

> Steven, I'd suggest just jettisoning that mailer.

FWIW, I've used claws for years and never run into an issue like that.
This email comes from claws; the quoting looks right in the compose window
now, I assume it will go out in good form. Otherwise I'll have just made
a fool of myself in front of everybody, but, then, that's what I do for a
living, so that's OK too.

The problem, from what Steve said before, had to do with the address
book. I assume there's a real bug there...I bet we could get it fixed.
They've taken patches from me without grief, and are generally responsive
to this sort of thing. I'll forward things over to the claws list and see
what they say.

jon

2013-10-10 22:57:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thursday, October 10, 2013 02:37:15 PM Linus Torvalds wrote:
> On Thu, Oct 10, 2013 at 2:35 PM, Rafael J. Wysocki <[email protected]> wrote:
> >
> > Well, I must have overlooked the original report. Is it available anywhere
> > I can find it?
>
> I think Steven has some buggered email system, he has other emails
> being eaten by lkml too, and apparently other mail gateways (because
> you were direct-cc'd on the original).

Mailer issues aside, I've just seen the original (Bjorn forwarded it to me,
thanks!) and I'm wondering if the message added by the debug patch below is
triggered along with the WARN_ON(). If it is, I think it's better to drop
the WARN_ON(), at least for now (until we sort out the acpiphp/pciehp
coexistence).

Rafael


---
drivers/pci/hotplug/acpiphp_glue.c | 5 +++++
1 file changed, 5 insertions(+)

Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
@@ -991,6 +991,11 @@ void acpiphp_enumerate_slots(struct pci_

if (!pci_is_root_bus(bridge->pci_bus)) {
struct acpiphp_context *context;
+ struct pci_dev *parent = bridge->pci_bus->parent->self;
+
+ if (parent && device_is_managed_by_native_pciehp(parent))
+ dev_warn(&bridge->pci_dev->dev,
+ "Parent is managed by pciehp!\n");

/*
* This bridge should have been registered as a hotplug function


2013-10-11 00:19:28

by Steven Rostedt

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thu, 10 Oct 2013 14:37:15 -0700
Linus Torvalds <[email protected]> wrote:

> On Thu, Oct 10, 2013 at 2:35 PM, Rafael J. Wysocki <[email protected]> wrote:
> >
> > Well, I must have overlooked the original report. Is it available anywhere
> > I can find it?
>
> I think Steven has some buggered email system, he has other emails
> being eaten by lkml too, and apparently other mail gateways (because
> you were direct-cc'd on the original).
>
> His email sender doesn't quote names with "," in them, and has headers

I think the issue is with "." not ","

> like this:
>
> From: Steven Rostedt <[email protected]>
> To: LKML <[email protected]>
> Cc: Linus Torvalds <[email protected]>, Rafael J. Wysocki

This may have been my fault, as I cut and pasted Rafael's email from
the git log, and did not add the quotes myself.

Perhaps claws should force names with '.' to be quoted. I don't
remember having this issue with Evolution (I switched to claws a couple
of months ago).


> <[email protected]>, Mika Westerberg
> <[email protected]>, Bjorn Helgaas <[email protected]>,
> Andrew Morton <[email protected]>
>
> which is apparently against SMTP rules. The magic line is:
>
> X-Mailer: Claws Mail 3.9.2 (GTK+ 2.24.20; x86_64-pc-linux-gnu)
>
> The tag-line for that mailer is quite appropriate: "The email reader
> that bites". That's what they put in the title on their web-page.
>
> Because it sure bites. Except the claws people seemed to think that
> was supposed to be a good thing. They clearly don't know the slang
> meaning of "that bites".
>
> Or maybe they do, and they are just unusually self-aware.
>
> Steven, I'd suggest just jettisoning that mailer.

The first time it was from the address book, as when I added it from an
email, claws stripped out the quotes when it added it. This time it was
just me cut and pasting a name with a '.' without adding quotes myself.

-- Steve

2013-10-11 01:00:52

by Linus Torvalds

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thu, Oct 10, 2013 at 5:19 PM, Steven Rostedt <[email protected]> wrote:
>
> This may have been my fault, as I cut and pasted Rafael's email from
> the git log, and did not add the quotes myself.

Well, it is your fault, but only because you use a buggy mailer from hell.

> Perhaps claws should force names with '.' to be quoted. I don't
> remember having this issue with Evolution (I switched to claws a couple
> of months ago).

There is no "perhaps" about this. It's clearly a Claws bug.

When you send email to Amaury Decrême (to pick a kernel email address
at random with special characters) do you think you should write his
email address as

=?UTF-8?q?Amaury=20Decr=C3=AAme?= <[email protected]>?

No you should not. For similar reasons, any email program that expects
you to quote dots in names is pure and utter garbage. The fact that
SMTP expects dots to be quoted has absolutely zero bearing on
anything, the same way it has zero bearing that SMTP headers should
use even odder quoting rules for other "special" characters.

File a bug on Claws, and if the developers brush it off as your own
problem, just stop using the PoS.

I realize that you seem to have this self-harming habit - first
evolution, now claws - but it's like cutting or anorexia. We're having
an intervention here, and the first step is to realize you have a
problem.

Linus

2013-10-11 01:07:21

by Linus Torvalds

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thu, Oct 10, 2013 at 6:06 PM, Joe Perches <[email protected]> wrote:
> On Thu, 2013-10-10 at 18:00 -0700, Linus Torvalds wrote:
>> the first step is to realize you have a problem.
>
> Says the guy that can only send patches to the list
> as attachments not plain text because he uses gmail.

Yeah. I fire up pine every once in a while if I have more than one
patch to send, but touché.

Linus

2013-10-11 01:07:31

by Joe Perches

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thu, 2013-10-10 at 18:00 -0700, Linus Torvalds wrote:
> the first step is to realize you have a problem.

Says the guy that can only send patches to the list
as attachments not plain text because he uses gmail.

2013-10-11 01:33:51

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Friday, October 11, 2013 01:09:33 AM Rafael J. Wysocki wrote:
> On Thursday, October 10, 2013 02:37:15 PM Linus Torvalds wrote:
> > On Thu, Oct 10, 2013 at 2:35 PM, Rafael J. Wysocki <[email protected]> wrote:
> > >
> > > Well, I must have overlooked the original report. Is it available anywhere
> > > I can find it?
> >
> > I think Steven has some buggered email system, he has other emails
> > being eaten by lkml too, and apparently other mail gateways (because
> > you were direct-cc'd on the original).
>
> Mailer issues aside, I've just seen the original (Bjorn forwarded it to me,
> thanks!) and I'm wondering if the message added by the debug patch below is
> triggered along with the WARN_ON(). If it is, I think it's better to drop
> the WARN_ON(), at least for now (until we sort out the acpiphp/pciehp
> coexistence).
>
> ---
> drivers/pci/hotplug/acpiphp_glue.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
> +++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
> @@ -991,6 +991,11 @@ void acpiphp_enumerate_slots(struct pci_
>
> if (!pci_is_root_bus(bridge->pci_bus)) {
> struct acpiphp_context *context;
> + struct pci_dev *parent = bridge->pci_bus->parent->self;
> +
> + if (parent && device_is_managed_by_native_pciehp(parent))
> + dev_warn(&bridge->pci_dev->dev,
> + "Parent is managed by pciehp!\n");
>
> /*
> * This bridge should have been registered as a hotplug function
>

Steve told me over IRC that the message added by the above triggered along
with the WARN_ON(), so this really was the issue I had in mind. So, I asked
Steve to test the appended patch and it worked for him.

Well, I admit this is a gray area, because we've never tried it, but I think
we'll need to try it at one point anyway and see how it goes, so why don't we
actually do that now?

If it turns out to cause problems to happen for people, we can simply revert
it and remove the WARN_ON() instead. And then we'll know that this is
problematic.

What do you think?

Rafael


---
From: Rafael J. Wysocki <[email protected]>
Subject: ACPI / hotplug / PCI: Accept coexistence with native PCIe hotplug

Allow ACPIPHP (ACPI-based PCI hotplug) to handle event signaling for
devices that have already been claimed by the native PCIe hotplug
(pciehp).

This change prevents the WARN_ON() in acpiphp_enumerate_slots() from
triggering unnecessarily for bridges whose parents are managed by
pciehp.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/hotplug/acpiphp_glue.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
@@ -274,9 +274,6 @@ static acpi_status register_slot(acpi_ha
struct pci_dev *pdev = bridge->pci_dev;
u32 val;

- if (pdev && device_is_managed_by_native_pciehp(pdev))
- return AE_OK;
-
status = acpi_evaluate_integer(handle, "_ADR", NULL, &adr);
if (ACPI_FAILURE(status)) {
acpi_handle_warn(handle, "can't evaluate _ADR (%#x)\n", status);
@@ -326,7 +323,8 @@ static acpi_status register_slot(acpi_ha
list_add_tail(&slot->node, &bridge->slots);

/* Register slots for ejectable funtions only. */
- if (acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle)) {
+ if ((acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle))
+ && !(pdev && device_is_managed_by_native_pciehp(pdev))) {
unsigned long long sun;
int retval;

2013-10-11 01:43:00

by Linus Torvalds

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thu, Oct 10, 2013 at 6:45 PM, Rafael J. Wysocki <[email protected]> wrote:
>
> /* Register slots for ejectable funtions only. */
> - if (acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle)) {
> + if ((acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle))
> + && !(pdev && device_is_managed_by_native_pciehp(pdev))) {
> unsigned long long sun;
> int retval;

I can't even begin to say whether this is a good solution or not,
because that if-conditional makes me want to go out and kill some
homeless people to let my aggressions out.

Can we please agree to *never* write code like this? Ever?

Use a well-named inline helper function where the name describes what
the f*ck the code is trying to do, and then comment the separate
issues. Because none of the above line noise makes me go "Ahh, it's
the test for an ejectable function".

What the heck _is_ an "ejectable function" anyway? The only comment
there just makes the code even less sensible.

Please?

Linus

2013-10-11 11:02:08

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thursday, October 10, 2013 06:42:56 PM Linus Torvalds wrote:
> On Thu, Oct 10, 2013 at 6:45 PM, Rafael J. Wysocki <[email protected]> wrote:
> >
> > /* Register slots for ejectable funtions only. */
> > - if (acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle)) {
> > + if ((acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle))
> > + && !(pdev && device_is_managed_by_native_pciehp(pdev))) {
> > unsigned long long sun;
> > int retval;
>
> I can't even begin to say whether this is a good solution or not,
> because that if-conditional makes me want to go out and kill some
> homeless people to let my aggressions out.
>
> Can we please agree to *never* write code like this? Ever?
>
> Use a well-named inline helper function where the name describes what
> the f*ck the code is trying to do, and then comment the separate
> issues. Because none of the above line noise makes me go "Ahh, it's
> the test for an ejectable function".
>
> What the heck _is_ an "ejectable function" anyway? The only comment
> there just makes the code even less sensible.
>
> Please?

From: Rafael J. Wysocki <[email protected]>
Subject: ACPI / hotplug / PCI: Accept coexistence with native PCIe hotplug

Allow ACPIPHP (ACPI-based PCI hotplug) to handle event signaling for
devices that have already been claimed by the native PCIe hotplug
(pciehp).

The ACPI hotplug events are essentially re-scan, remove and eject
requests. Re-scan and remove should work regardless, because they
may be triggered by user space via sysfs and the ACPI eject (_EJ0)
should work if the BIOS wants us to use it. There may be an issue
if the BIOS signals ACPI eject and wants us to use the native eject,
but that doesn't work without this change anyway.

This change prevents the WARN_ON() in acpiphp_enumerate_slots() from
triggering unnecessarily for bridges whose parents are managed by
pciehp.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/hotplug/acpiphp_glue.c | 32 ++++++++++++++++++++++++++------
1 file changed, 26 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
@@ -259,6 +259,31 @@ static void acpiphp_dock_release(void *d
put_bridge(context->func.parent);
}

+/**
+ * slot_should_be_exposed - Check whether or not to expose a slot to userland.
+ * @bridge: ACPIPHP bridge the slot belongs to.
+ * @handle: ACPI handle of a device in the slot.
+ */
+static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
+ acpi_handle handle)
+{
+ struct pci_bus *pbus = bridge->pci_bus;
+ struct pci_dev *pdev = bridge->pci_dev;
+
+ /*
+ * Do not expose slots whose bridges are managed by pciehp, because they
+ * will be exposed to user space by the pciehp driver.
+ */
+ if (pdev && device_is_managed_by_native_pciehp(pdev))
+ return false;
+
+ /*
+ * Expose slots for devices with either _EJ0 or _RMV and for devices
+ * on docking stations.
+ */
+ return acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle);
+}
+
/* callback routine to register each ACPI PCI slot object */
static acpi_status register_slot(acpi_handle handle, u32 lvl, void *data,
void **rv)
@@ -271,12 +296,8 @@ static acpi_status register_slot(acpi_ha
unsigned long long adr;
int device, function;
struct pci_bus *pbus = bridge->pci_bus;
- struct pci_dev *pdev = bridge->pci_dev;
u32 val;

- if (pdev && device_is_managed_by_native_pciehp(pdev))
- return AE_OK;
-
status = acpi_evaluate_integer(handle, "_ADR", NULL, &adr);
if (ACPI_FAILURE(status)) {
acpi_handle_warn(handle, "can't evaluate _ADR (%#x)\n", status);
@@ -325,8 +346,7 @@ static acpi_status register_slot(acpi_ha

list_add_tail(&slot->node, &bridge->slots);

- /* Register slots for ejectable funtions only. */
- if (acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle)) {
+ if (slot_should_be_exposed(bridge, handle)) {
unsigned long long sun;
int retval;

2013-10-11 13:01:56

by Steven Rostedt

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Fri, 11 Oct 2013 13:13:47 +0200
"Rafael J. Wysocki" <[email protected]> wrote:
>
> From: Rafael J. Wysocki <[email protected]>
> Subject: ACPI / hotplug / PCI: Accept coexistence with native PCIe hotplug
>
> Allow ACPIPHP (ACPI-based PCI hotplug) to handle event signaling for
> devices that have already been claimed by the native PCIe hotplug
> (pciehp).
>
> The ACPI hotplug events are essentially re-scan, remove and eject
> requests. Re-scan and remove should work regardless, because they
> may be triggered by user space via sysfs and the ACPI eject (_EJ0)
> should work if the BIOS wants us to use it. There may be an issue
> if the BIOS signals ACPI eject and wants us to use the native eject,
> but that doesn't work without this change anyway.
>
> This change prevents the WARN_ON() in acpiphp_enumerate_slots() from
> triggering unnecessarily for bridges whose parents are managed by
> pciehp.
>

Reported-by: Steven Rostedt <[email protected]>
Tested-by: Steven Rostedt <[email protected]>

-- Steve

> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> drivers/pci/hotplug/acpiphp_glue.c | 32 ++++++++++++++++++++++++++------
> 1 file changed, 26 insertions(+), 6 deletions(-)
>
> Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
> +++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
> @@ -259,6 +259,31 @@ static void acpiphp_dock_release(void *d
> put_bridge(context->func.parent);
> }
>
> +/**
> + * slot_should_be_exposed - Check whether or not to expose a slot to userland.
> + * @bridge: ACPIPHP bridge the slot belongs to.
> + * @handle: ACPI handle of a device in the slot.
> + */
> +static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
> + acpi_handle handle)
> +{
> + struct pci_bus *pbus = bridge->pci_bus;
> + struct pci_dev *pdev = bridge->pci_dev;
> +
> + /*
> + * Do not expose slots whose bridges are managed by pciehp, because they
> + * will be exposed to user space by the pciehp driver.
> + */
> + if (pdev && device_is_managed_by_native_pciehp(pdev))
> + return false;
> +
> + /*
> + * Expose slots for devices with either _EJ0 or _RMV and for devices
> + * on docking stations.
> + */
> + return acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle);
> +}
> +
> /* callback routine to register each ACPI PCI slot object */
> static acpi_status register_slot(acpi_handle handle, u32 lvl, void *data,
> void **rv)
> @@ -271,12 +296,8 @@ static acpi_status register_slot(acpi_ha
> unsigned long long adr;
> int device, function;
> struct pci_bus *pbus = bridge->pci_bus;
> - struct pci_dev *pdev = bridge->pci_dev;
> u32 val;
>
> - if (pdev && device_is_managed_by_native_pciehp(pdev))
> - return AE_OK;
> -
> status = acpi_evaluate_integer(handle, "_ADR", NULL, &adr);
> if (ACPI_FAILURE(status)) {
> acpi_handle_warn(handle, "can't evaluate _ADR (%#x)\n", status);
> @@ -325,8 +346,7 @@ static acpi_status register_slot(acpi_ha
>
> list_add_tail(&slot->node, &bridge->slots);
>
> - /* Register slots for ejectable funtions only. */
> - if (acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle)) {
> + if (slot_should_be_exposed(bridge, handle)) {
> unsigned long long sun;
> int retval;
>

2013-10-11 14:08:30

by Jonathan Corbet

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Thu, 10 Oct 2013 18:00:48 -0700
Linus Torvalds <[email protected]> wrote:

> File a bug on Claws, and if the developers brush it off as your own
> problem, just stop using the PoS.

They acknowledge the bug - it was already known to them, actually. With
luck we'll see a fix soon.

jon

2013-10-11 15:09:04

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Fri, Oct 11, 2013 at 7:01 AM, Steven Rostedt <[email protected]> wrote:
> On Fri, 11 Oct 2013 13:13:47 +0200
> "Rafael J. Wysocki" <[email protected]> wrote:
>>
>> From: Rafael J. Wysocki <[email protected]>
>> Subject: ACPI / hotplug / PCI: Accept coexistence with native PCIe hotplug
>>
>> Allow ACPIPHP (ACPI-based PCI hotplug) to handle event signaling for
>> devices that have already been claimed by the native PCIe hotplug
>> (pciehp).
>>
>> The ACPI hotplug events are essentially re-scan, remove and eject
>> requests. Re-scan and remove should work regardless, because they
>> may be triggered by user space via sysfs and the ACPI eject (_EJ0)
>> should work if the BIOS wants us to use it. There may be an issue
>> if the BIOS signals ACPI eject and wants us to use the native eject,
>> but that doesn't work without this change anyway.
>>
>> This change prevents the WARN_ON() in acpiphp_enumerate_slots() from
>> triggering unnecessarily for bridges whose parents are managed by
>> pciehp.
>>
>
> Reported-by: Steven Rostedt <[email protected]>
> Tested-by: Steven Rostedt <[email protected]>

I opened https://bugzilla.kernel.org/show_bug.cgi?id=62831 for this
issue because I don't think this question of how to coordinate acpiphp
and pciehp is completely resolved, and I think it might be interesting
to have the complete dmesg, acpidump, and "lspci -vv" output attached
there for future reference.

Steve, I tried to attach the dmesg you mentioned yesterday in IRC, but
the link didn't work for me. Would you mind attaching this info?

Rafael, I assume you'll probably merge this through your tree. Would
you mind adding a reference to this bugzilla in the changelog? I do
have a "convert to dynamic debug" acpiphp patch in my "next" branch
(bd950799), but I suspect you have several more interesting ones in
your tree.

Bjorn

>> Signed-off-by: Rafael J. Wysocki <[email protected]>
>> ---
>> drivers/pci/hotplug/acpiphp_glue.c | 32 ++++++++++++++++++++++++++------
>> 1 file changed, 26 insertions(+), 6 deletions(-)
>>
>> Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
>> ===================================================================
>> --- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
>> +++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
>> @@ -259,6 +259,31 @@ static void acpiphp_dock_release(void *d
>> put_bridge(context->func.parent);
>> }
>>
>> +/**
>> + * slot_should_be_exposed - Check whether or not to expose a slot to userland.
>> + * @bridge: ACPIPHP bridge the slot belongs to.
>> + * @handle: ACPI handle of a device in the slot.
>> + */
>> +static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
>> + acpi_handle handle)
>> +{
>> + struct pci_bus *pbus = bridge->pci_bus;
>> + struct pci_dev *pdev = bridge->pci_dev;
>> +
>> + /*
>> + * Do not expose slots whose bridges are managed by pciehp, because they
>> + * will be exposed to user space by the pciehp driver.
>> + */
>> + if (pdev && device_is_managed_by_native_pciehp(pdev))
>> + return false;
>> +
>> + /*
>> + * Expose slots for devices with either _EJ0 or _RMV and for devices
>> + * on docking stations.
>> + */
>> + return acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle);
>> +}
>> +
>> /* callback routine to register each ACPI PCI slot object */
>> static acpi_status register_slot(acpi_handle handle, u32 lvl, void *data,
>> void **rv)
>> @@ -271,12 +296,8 @@ static acpi_status register_slot(acpi_ha
>> unsigned long long adr;
>> int device, function;
>> struct pci_bus *pbus = bridge->pci_bus;
>> - struct pci_dev *pdev = bridge->pci_dev;
>> u32 val;
>>
>> - if (pdev && device_is_managed_by_native_pciehp(pdev))
>> - return AE_OK;
>> -
>> status = acpi_evaluate_integer(handle, "_ADR", NULL, &adr);
>> if (ACPI_FAILURE(status)) {
>> acpi_handle_warn(handle, "can't evaluate _ADR (%#x)\n", status);
>> @@ -325,8 +346,7 @@ static acpi_status register_slot(acpi_ha
>>
>> list_add_tail(&slot->node, &bridge->slots);
>>
>> - /* Register slots for ejectable funtions only. */
>> - if (acpi_pci_check_ejectable(pbus, handle) || is_dock_device(handle)) {
>> + if (slot_should_be_exposed(bridge, handle)) {
>> unsigned long long sun;
>> int retval;
>>
>

2013-10-11 17:21:41

by Linus Torvalds

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Fri, Oct 11, 2013 at 4:13 AM, Rafael J. Wysocki <[email protected]> wrote:
> +/**
> + * slot_should_be_exposed - Check whether or not to expose a slot to userland.
> + * @bridge: ACPIPHP bridge the slot belongs to.
> + * @handle: ACPI handle of a device in the slot.
> + */
> +static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
> + acpi_handle handle)

Thanks, that looks much better.

I do worry that we now seem to add the slot to all the acpiphp lists
even if it is managed by pciehp. That gets rid of the warning Steven
saw (because now it always has that context), but I'm left wondering
how much pcihp and aciphp will fight over the slot.

Yes, the acpiphp_register_hotplug_slot() doesn't get called, but we
still do register_hotplug_dock_device(), for example. How does that
interact with pcihp that thinks it owns the slot?

Or am I misreading the code? It's more readable, and no longer makes
me homicidal, but I don't actually know the code itself.

Linus

2013-10-11 21:47:07

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Friday, October 11, 2013 10:21:35 AM Linus Torvalds wrote:
> On Fri, Oct 11, 2013 at 4:13 AM, Rafael J. Wysocki <[email protected]> wrote:
> > +/**
> > + * slot_should_be_exposed - Check whether or not to expose a slot to userland.
> > + * @bridge: ACPIPHP bridge the slot belongs to.
> > + * @handle: ACPI handle of a device in the slot.
> > + */
> > +static inline bool slot_should_be_exposed(struct acpiphp_bridge *bridge,
> > + acpi_handle handle)
>
> Thanks, that looks much better.
>
> I do worry that we now seem to add the slot to all the acpiphp lists
> even if it is managed by pciehp. That gets rid of the warning Steven
> saw (because now it always has that context), but I'm left wondering
> how much pcihp and aciphp will fight over the slot.
>
> Yes, the acpiphp_register_hotplug_slot() doesn't get called, but we
> still do register_hotplug_dock_device(), for example. How does that
> interact with pcihp that thinks it owns the slot?

Well, owning the slot doesn't really mean much here, because the "rescan"
and "remove" things may always be triggered by user space via sysfs from
under the PCI device in question (regardless of whether or not pciehp
thinks that it "owns" that device). So if they are triggered by an ACPI
notify instead, that should still be fine.

Ejects are more of a gray area, but they do the "remove" first and only
then they go for an actual "eject". Question is if we should execute
_EJ0 provided that it's actually present for the pciehp slots (which we will
do with the patch applied). It might be safer to trigger the native eject
then, but again I'd be surprised if _EJ0 didn't work anyway (if there is a
system in which _EJ0 is available for a device handled by pciehp in the first
place).

As far as docking stations go, the undock is done by ACPI anyway and it will
carry out "remove" for all devices under the dock, so the patch doesn't change
this particular case as far as I can say.

> Or am I misreading the code? It's more readable, and no longer makes
> me homicidal, but I don't actually know the code itself.

I think you're reading it correctly, it really makes acpiphp see all slots
even if pciehp sees them too. So the change is somewhat risky.

That said the risk doesn't seem to be huge and there seem to be cases in
which it actually would be useful to have both acpiphp and pciehp signaling
available for the same device. For example, even if the BIOS told us that
we could use the native mechanism (pciehp), it may not actually work. That is,
we may not get any hotplug interrupts from PCIe ports due to platform bugs of
some sort and we may get ACPI notifications instead (because the platform
designer knew about those bugs and thought it would be smart to use ACPI to
work around them).

There are bug reports indicating thinks like that, so we were going to allow
acpiphp and pciehp to handle the same devices anyway at one point. I thought
we might as well try to do it now and see how it goes. Still, if you think
it's too risky for this stage of the cycle, I'll just send a patch removing
the WARN_ON() and we'll revisit that thing in 3.13.

Rafael

2013-10-11 23:44:49

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [BUG] WARN_ON(!context) in drivers/pci/hotplug/acpiphp_glue.c

On Friday, October 11, 2013 11:58:48 PM Rafael J. Wysocki wrote:
> On Friday, October 11, 2013 10:21:35 AM Linus Torvalds wrote:
> > On Fri, Oct 11, 2013 at 4:13 AM, Rafael J. Wysocki <[email protected]> wrote:

[...]

> > Or am I misreading the code? It's more readable, and no longer makes
> > me homicidal, but I don't actually know the code itself.
>
> I think you're reading it correctly, it really makes acpiphp see all slots
> even if pciehp sees them too. So the change is somewhat risky.
>
> That said the risk doesn't seem to be huge and there seem to be cases in
> which it actually would be useful to have both acpiphp and pciehp signaling
> available for the same device. For example, even if the BIOS told us that
> we could use the native mechanism (pciehp), it may not actually work. That is,
> we may not get any hotplug interrupts from PCIe ports due to platform bugs of
> some sort and we may get ACPI notifications instead (because the platform
> designer knew about those bugs and thought it would be smart to use ACPI to
> work around them).
>
> There are bug reports indicating thinks like that, so we were going to allow
> acpiphp and pciehp to handle the same devices anyway at one point. I thought
> we might as well try to do it now and see how it goes. Still, if you think
> it's too risky for this stage of the cycle, I'll just send a patch removing
> the WARN_ON() and we'll revisit that thing in 3.13.

Having reconsidered this I think it's best to just drop the WARN_ON() for now
after all and sort out the coexistence between acpiphp and pciehp later, so
that we don't run into a weird corner case late in the cycle.

So the one below is what I'm going to do for 3.12.

Rafael

---
From: Rafael J. Wysocki <[email protected]>
Subject: ACPI / hotplug / PCI: Drop WARN_ON() from acpiphp_enumerate_slots()

The WARN_ON() in acpiphp_enumerate_slots() triggers unnecessarily for
devices whose bridges are going to be handled by native PCIe hotplug
(pciehp) and the simplest way to prevent that from happening is to
drop the WARN_ON().

References: https://bugzilla.kernel.org/show_bug.cgi?id=62831
Reported-by: Steven Rostedt <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/hotplug/acpiphp_glue.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
@@ -999,12 +999,13 @@ void acpiphp_enumerate_slots(struct pci_

/*
* This bridge should have been registered as a hotplug function
- * under its parent, so the context has to be there. If not, we
- * are in deep goo.
+ * under its parent, so the context should be there, unless the
+ * parent is going to be handled by pciehp, in which case this
+ * bridge is not interesting to us either.
*/
mutex_lock(&acpiphp_context_lock);
context = acpiphp_get_context(handle);
- if (WARN_ON(!context)) {
+ if (!context) {
mutex_unlock(&acpiphp_context_lock);
put_device(&bus->dev);
kfree(bridge);