commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
had intended to work around firmware problems on a Microsoft Surface
device but actually masked other real bugs in firmware and the driver.
Before this commit, "wake on lid" doesn't work properly on a number of
systems, but this is because debounce handling was improperly configured
in the driver and due to a bug in this commit it gets configured a
different way.
commit b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on
resume") attempted to build on top of this to mask issues on resume, but
it happened to "fix" the bug in commit 4e5a04be88fe ("pinctrl: amd:
disable and mask interrupts on probe") which "broke" wake on lid since
the debounce handling was programmed differently.
This was reverted in commit 534e465845eb ("Revert "pinctrl: amd: Disable
and mask interrupts on resume"") which fixed the wake on lid.
To fix this series of unfortunate events and prevent them in the future
this series corrects the GPIO0 debounce handling and reverts commit
4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe"). A new
patch that is safer is included that will fix spurious interrupt handling
and is expected to fix the issues that both
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
and
commit b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on
resume") attempted to fix in a more scalable way.
Kornel Dulęba (1):
pinctrl: amd: Detect and mask spurious interrupts
Mario Limonciello (3):
pinctrl: amd: Detect internal GPIO0 debounce handling
pinctrl: amd: Fix mistake in handling clearing pins at startup
pinctrl: amd: Revert "pinctrl: amd: disable and mask interrupts on
probe"
drivers/pinctrl/pinctrl-amd.c | 50 +++++++++--------------------------
drivers/pinctrl/pinctrl-amd.h | 1 +
2 files changed, 14 insertions(+), 37 deletions(-)
--
2.34.1
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
was well intentioned to mask a firmware issue on a surface laptop, but it
has a few problems:
1. It had a bug in the loop handling for iteration 63 that lead to other
problems with GPIO0 handling.
2. It disables interrupts that are used internally by the SOC but masked
by default.
3. It masked a real firmware problem in some chromebooks that should have
been caught during development but wasn't.
There has been a lot of other development around s2idle; particularly
around handling of the spurious wakeups. If there is still a problem on
the original reported surface laptop it should be avoided by adding a quirk
to gpiolib-acpi for that system instead.
Signed-off-by: Mario Limonciello <[email protected]>
---
drivers/pinctrl/pinctrl-amd.c | 31 -------------------------------
1 file changed, 31 deletions(-)
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 675c9826b78a..e9fef2391b38 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -877,34 +877,6 @@ static const struct pinconf_ops amd_pinconf_ops = {
.pin_config_group_set = amd_pinconf_group_set,
};
-static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
-{
- struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
- unsigned long flags;
- u32 pin_reg, mask;
- int i;
-
- mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
- BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) |
- BIT(WAKE_CNTRL_OFF_S4);
-
- for (i = 0; i < desc->npins; i++) {
- int pin = desc->pins[i].number;
- const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin);
-
- if (!pd)
- continue;
-
- raw_spin_lock_irqsave(&gpio_dev->lock, flags);
-
- pin_reg = readl(gpio_dev->base + pin * 4);
- pin_reg &= ~mask;
- writel(pin_reg, gpio_dev->base + pin * 4);
-
- raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
- }
-}
-
#ifdef CONFIG_PM_SLEEP
static bool amd_gpio_should_save(struct amd_gpio *gpio_dev, unsigned int pin)
{
@@ -1142,9 +1114,6 @@ static int amd_gpio_probe(struct platform_device *pdev)
return PTR_ERR(gpio_dev->pctrl);
}
- /* Disable and mask interrupts */
- amd_gpio_irq_init(gpio_dev);
-
girq = &gpio_dev->gc.irq;
gpio_irq_chip_set_chip(girq, &amd_gpio_irqchip);
/* This will let us handle the parent IRQ in the driver */
--
2.34.1
From: Kornel Dulęba <[email protected]>
Leverage gpiochip_line_is_irq to check whether a pin has an irq
associated with it. The previous check ("irq == 0") didn't make much
sense. The irq variable refers to the pinctrl irq, and has nothing do to
with an individual pin.
On some systems, during suspend/resume cycle, the firmware leaves
an interrupt enabled on a pin that is not used by the kernel.
Without this patch that caused an interrupt storm.
Cc: [email protected]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217315
Signed-off-by: Kornel Dulęba <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
---
drivers/pinctrl/pinctrl-amd.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 24465010397b..675c9826b78a 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -660,21 +660,21 @@ static bool do_amd_gpio_irq_handler(int irq, void *dev_id)
* We must read the pin register again, in case the
* value was changed while executing
* generic_handle_domain_irq() above.
- * If we didn't find a mapping for the interrupt,
- * disable it in order to avoid a system hang caused
- * by an interrupt storm.
+ * If the line is not an irq, disable it in order to
+ * avoid a system hang caused by an interrupt storm.
*/
raw_spin_lock_irqsave(&gpio_dev->lock, flags);
regval = readl(regs + i);
- if (irq == 0) {
- regval &= ~BIT(INTERRUPT_ENABLE_OFF);
+ if (!gpiochip_line_is_irq(gc, irqnr + i)) {
+ regval &= ~BIT(INTERRUPT_MASK_OFF);
dev_dbg(&gpio_dev->pdev->dev,
"Disabling spurious GPIO IRQ %d\n",
irqnr + i);
+ } else {
+ ret = true;
}
writel(regval, regs + i);
raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
- ret = true;
}
}
/* did not cause wake on resume context for shared IRQ */
--
2.34.1
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
had a mistake in loop iteration 63 that it would clear offset 0xFC instead
of 0x100. Offset 0xFC is actually `WAKE_INT_MASTER_REG`. This was
clearing bits 13 and 15 from the register which significantly changed the
expected handling for some platforms for GPIO0.
Cc: [email protected]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217315
Signed-off-by: Mario Limonciello <[email protected]>
---
drivers/pinctrl/pinctrl-amd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 6b9ae92017d4..24465010397b 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -897,9 +897,9 @@ static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
raw_spin_lock_irqsave(&gpio_dev->lock, flags);
- pin_reg = readl(gpio_dev->base + i * 4);
+ pin_reg = readl(gpio_dev->base + pin * 4);
pin_reg &= ~mask;
- writel(pin_reg, gpio_dev->base + i * 4);
+ writel(pin_reg, gpio_dev->base + pin * 4);
raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
}
--
2.34.1
[Public]
> commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
> was well intentioned to mask a firmware issue on a surface laptop, but it
> has a few problems:
> 1. It had a bug in the loop handling for iteration 63 that lead to other
> problems with GPIO0 handling.
> 2. It disables interrupts that are used internally by the SOC but masked
> by default.
> 3. It masked a real firmware problem in some chromebooks that should have
> been caught during development but wasn't.
>
> There has been a lot of other development around s2idle; particularly
> around handling of the spurious wakeups. If there is still a problem on
> the original reported surface laptop it should be avoided by adding a quirk
> to gpiolib-acpi for that system instead.
>
> Signed-off-by: Mario Limonciello <[email protected]>
Sachi,
Any feedback on this series, particularly on this patch? I'd like to understand
what GPIO pins prompted the original patch and if 3/4 of this series helps the
original behavior so that we can have confirmation the revert is not going to
cause a regression for you.
Thanks,
> ---
> drivers/pinctrl/pinctrl-amd.c | 31 -------------------------------
> 1 file changed, 31 deletions(-)
>
> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
> index 675c9826b78a..e9fef2391b38 100644
> --- a/drivers/pinctrl/pinctrl-amd.c
> +++ b/drivers/pinctrl/pinctrl-amd.c
> @@ -877,34 +877,6 @@ static const struct pinconf_ops amd_pinconf_ops = {
> .pin_config_group_set = amd_pinconf_group_set,
> };
>
> -static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
> -{
> - struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
> - unsigned long flags;
> - u32 pin_reg, mask;
> - int i;
> -
> - mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
> - BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF)
> |
> - BIT(WAKE_CNTRL_OFF_S4);
> -
> - for (i = 0; i < desc->npins; i++) {
> - int pin = desc->pins[i].number;
> - const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl,
> pin);
> -
> - if (!pd)
> - continue;
> -
> - raw_spin_lock_irqsave(&gpio_dev->lock, flags);
> -
> - pin_reg = readl(gpio_dev->base + pin * 4);
> - pin_reg &= ~mask;
> - writel(pin_reg, gpio_dev->base + pin * 4);
> -
> - raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
> - }
> -}
> -
> #ifdef CONFIG_PM_SLEEP
> static bool amd_gpio_should_save(struct amd_gpio *gpio_dev, unsigned int
> pin)
> {
> @@ -1142,9 +1114,6 @@ static int amd_gpio_probe(struct platform_device
> *pdev)
> return PTR_ERR(gpio_dev->pctrl);
> }
>
> - /* Disable and mask interrupts */
> - amd_gpio_irq_init(gpio_dev);
> -
> girq = &gpio_dev->gc.irq;
> gpio_irq_chip_set_chip(girq, &amd_gpio_irqchip);
> /* This will let us handle the parent IRQ in the driver */
> --
> 2.34.1
On Fri, Apr 21, 2023 at 2:06 PM Mario Limonciello
<[email protected]> wrote:
> commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
> had intended to work around firmware problems on a Microsoft Surface
> device but actually masked other real bugs in firmware and the driver.
>
> Before this commit, "wake on lid" doesn't work properly on a number of
> systems, but this is because debounce handling was improperly configured
> in the driver and due to a bug in this commit it gets configured a
> different way.
>
> commit b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on
> resume") attempted to build on top of this to mask issues on resume, but
> it happened to "fix" the bug in commit 4e5a04be88fe ("pinctrl: amd:
> disable and mask interrupts on probe") which "broke" wake on lid since
> the debounce handling was programmed differently.
>
> This was reverted in commit 534e465845eb ("Revert "pinctrl: amd: Disable
> and mask interrupts on resume"") which fixed the wake on lid.
>
> To fix this series of unfortunate events and prevent them in the future
> this series corrects the GPIO0 debounce handling and reverts commit
> 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe"). A new
> patch that is safer is included that will fix spurious interrupt handling
> and is expected to fix the issues that both
> commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
> and
> commit b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on
> resume") attempted to fix in a more scalable way.
I applied the series for the next kernel cycle (becoming v6.5).
If it gets urgent to get this in we can move it into the fixes (for v6.4 and
stable) but at least this way we get some rotation in linux-next and wide
testing of the patches.
I will push it once v6.4-rc1 is out.
Yours,
Linus Walleij
On 5/5/2023 6:43 AM, Linus Walleij wrote:
> On Fri, Apr 21, 2023 at 2:06 PM Mario Limonciello
> <[email protected]> wrote:
>
>> commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
>> had intended to work around firmware problems on a Microsoft Surface
>> device but actually masked other real bugs in firmware and the driver.
>>
>> Before this commit, "wake on lid" doesn't work properly on a number of
>> systems, but this is because debounce handling was improperly configured
>> in the driver and due to a bug in this commit it gets configured a
>> different way.
>>
>> commit b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on
>> resume") attempted to build on top of this to mask issues on resume, but
>> it happened to "fix" the bug in commit 4e5a04be88fe ("pinctrl: amd:
>> disable and mask interrupts on probe") which "broke" wake on lid since
>> the debounce handling was programmed differently.
>>
>> This was reverted in commit 534e465845eb ("Revert "pinctrl: amd: Disable
>> and mask interrupts on resume"") which fixed the wake on lid.
>>
>> To fix this series of unfortunate events and prevent them in the future
>> this series corrects the GPIO0 debounce handling and reverts commit
>> 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe"). A new
>> patch that is safer is included that will fix spurious interrupt handling
>> and is expected to fix the issues that both
>> commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
>> and
>> commit b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on
>> resume") attempted to fix in a more scalable way.
> I applied the series for the next kernel cycle (becoming v6.5).
>
> If it gets urgent to get this in we can move it into the fixes (for v6.4 and
> stable) but at least this way we get some rotation in linux-next and wide
> testing of the patches.
>
> I will push it once v6.4-rc1 is out.
>
> Yours,
> Linus Walleij
Thanks! Especially given the regression we were dealing with before I
think this is the right
approach.